

With regard to the new claims 5-20, which relate to 
the δ' subunit of polymerase III holoenzyme, it is well known 
to one skilled in the art that proteins homologous to this 
subunit of polymerase III holoenzyme are contained in 
organisms other than E. coli, as shown in the Declaration of 
Michael O'Donnell under 37 CFR § 1.132 submitted in parent 
U.S. Patent Application Serial No. 08/279,058 on December 17, 
1996 ("O'Donnell Declaration") (copy submitted herewith). 

Those skilled in the art recognize that the δ' subunit 
from E. coli has sequence homology to accessory protein 
complexes of various other organisms (O'Donnell Declaration 
¶ 13). For example, in O'Donnell et al., "Homology in 
Accessory Proteins of Replicative Polymerases - E. coli to 
Humans," Nucleic Acids Research 21(1): 1-3 (1993) 
("O'Donnell") (copy attached hereto as Exhibit 1), a comparison 
of amino acid sequences shows the homology between proteins of 
replicative polymerases of E. coli, humans, and phage T4 
(O'Donnell Declaration ¶ 13). In Carter et al., 
"Identification, Isolation, and Characterization of the 
Structural Gene Encoding the δ' Subunit of Escherichia coli 
DNA Polymerase III Holoenzyme," J. of Bacteriology, 
175(12):3812-22 (1993), Figure 5 diagrams the homology of the 
δ' amino acid sequence to other replication proteins 
(O'Donnell Declaration ¶ 13). Comparison of the δ' amino acid 
sequence revealed similarity to the A1 (replication factor C) 
complex of HeLa cells and to the gene 44 protein (gp44) of 
bacteriophage T4 (O'Donnell Declaration ¶ 13). In addition, 
amino acid sequence similarity was found to the gene product 
of B. subtilis (O'Donnell Declaration ¶ 13). Further, the 
structural homology of the δ' subunit to other replication 
proteins has been proven to be true (O'Donnell Declaration 
¶ 13). For example, the genome project of Haemophilus 
influenzae showed homologues to all 10 subunits of E. coli DNA 
polymerase III holoenzyme, including δ, δ', χ, ψ, and θ 
(O'Donnell Declaration ¶ 13). GenBank now also 
shows homologues to the δ' subunit of E. coli from a large 
variety of organisms, including the following: Prokaryotes: 
Escherichia coli, Haemophilus influenzae, Micrococcus luteus, 
Pseudomonas aeruginosa, Bacillus subtilis, Caulobacter 
crescentus; Archaebacteria: Thermus thermophilus (extreme 
thermophile); Eukaryotes: Drosophila melanogaster (fly, 
insect), Caenorhabditis elegans (nematode, worm), Gallus 
gallus (chicken), Homo sapiens (man), Saccharomyces cerevisiae 
(yeast), and Schizosaccharomyces pombe (yeast) (O'Donnell 
Declaration ¶ 13). 

Further, the sequences of the human homologues of δ', 
and indeed of the other δ' homologues, are sufficiently 
homologous to the δ' subunit of E. coli to provide for 
identifying and obtaining the corresponding δ' (holA) gene 
from these organisms using the gene encoding the δ' subunit of 
E. coli in the following ways: (1) use of the E. coli holA 
gene, or fragments of the E. coli gene, as a probe in a 
Southern analysis of whole cell DNA from another organism to 
identify the corresponding δ' homologue; (2) use of holA, or 
its fragments, as a probe to screen cDNA plasmid libraries of 
other organisms; (3) use of the holA gene sequence to 
synthesize oligonucleotide primers for PCR to amplify the 
corresponding δ' homologue from total genomic DNA from other 
organisms; and (4) use of the holA gene sequence to identify 
the δ' homologue from a genome sequencing project of other 
organisms by sequence comparison to the E. coli holA gene 
(O'Donnell Declaration ¶ 14). 
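
By way of illustration only, approach (4), identification by sequence 
comparison, can be sketched as a percent-identity calculation over two 
sequences that have already been aligned. The function name and the toy 
sequences below are hypothetical placeholders chosen for the example; 
they are not taken from the O'Donnell Declaration or from any actual 
holA sequence, and a practical comparison would ordinarily rely on an 
established alignment tool rather than this simplified sketch. 

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity over the gap-free columns of two pre-aligned sequences.

    Both inputs are assumed to be the same length, with '-' marking gaps.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_a, aligned_b):
        if a == "-" or b == "-":
            continue  # a column with a gap in either sequence is not scored
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        return 0.0
    return 100.0 * matches / compared


# Hypothetical toy fragments (illustrative only, not real gene products).
query = "MKLV-GKTSLA"
candidate = "MRLVAGKT-LA"
print(round(percent_identity(query, candidate), 1))
```

A candidate open reading frame from a genome sequencing project whose 
identity to the query exceeds a chosen threshold would then be examined 
further as a putative homologue; the threshold itself is a matter of 
judgment and is not specified here. 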

The present application fully discusses the 
isolation and sequencing of the δ' subunit and its encoding 
genes for the polymerase III holoenzyme. In view of the 
disclosure of these experimental procedures, and the known 
structural and functional homology of the δ' subunit proteins 
from various sources such as yeast RFC subunits, human RFC 
subunits, bacteriophage T4 gp44 subunits, and DnaH subunits of 
B. subtilis, it would not require an undue amount of 
experimentation for one skilled in the art to isolate and 
sequence the claimed δ' protein (and its encoding gene) from 
sources other than E. coli. 

With regard to new claims 21-75, which relate to 
the protein subunits other than δ', it is well known to one 
skilled in the art that proteins in other organisms have 
functional and structural homology to the subunits of E. coli 
(O'Donnell Declaration ¶¶ 10-16). 

As discussed above, in O'Donnell a comparison of 
amino acid sequences shows the homology between proteins of 
replicative polymerases of E. coli, humans, and phage T4. 
Further, in Sanders et al., "Rules Governing the Efficiency 
and Polarity of Loading a Tracking Clamp Protein Onto DNA: 
Determinants of Enhancement in Bacteriophage T4 Late 
Transcription," The EMBO Journal 14(16):3966-76 
(1995) ("Sanders") (copy attached hereto as Exhibit 2), the 
common elements of structure and function of replicative DNA 
polymerases of eukaryotes, prokaryotes, and certain viruses 
are discussed (O'Donnell Declaration ¶ 12). It is disclosed 
that the replicative DNA polymerases of all of these sources 
are composed of a core enzyme and a set of accessory proteins 
(O'Donnell Declaration ¶ 12). Further, Stillman, "Smart 
Machines at the DNA Replication Fork," Cell 78:725-28 
(1994) ("Stillman") (copy attached hereto as Exhibit 3) 
discusses the functional similarity of proteins from E. coli, 
humans, and phage T4 that carry out replication (O'Donnell 
Declaration ¶ 12). Specifically, these exhibits show that E. 
coli contains an accessory complex called the γ complex, which 
contains the subunits γ, δ, δ', ψ, and χ (O'Donnell 
Declaration ¶ 12). Further, these exhibits show that 
proteins homologous to the γ complex are also present in 
eukaryotic (containing RFC complex), phage T4 (containing g44 
complex), and human (containing RFC complex) organisms 
(O'Donnell Declaration ¶ 12). 

As further discussed in these references, all 
cellular replicases known to date utilize a DNA sliding clamp 
that encircles DNA and acts as a mobile tether to hold 
the DNA polymerase continuously on the DNA as it moves along. 
The sliding clamp must be assembled around the DNA by an 
accessory protein and, therefore, a clamp loader 
that couples ATP hydrolysis to assembly of the clamp onto DNA 
for use by the polymerase is required. 

As disclosed in O'Donnell, Stillman, and Sanders, 
the clamp loader complex of E. coli is the γ complex, the 
clamp loader complex in human and eukaryotic organisms is the 
RFC complex, and the clamp loader complex in phage T4 is the 
g44 protein complex. These clamp loader complexes all contain 
5 subunits, and are functionally similar in utilizing ATP to 
transfer their respective clamp onto DNA. Further, the 
subunits of the clamp loaders are homologous in their amino 
acid sequence from E. coli to humans, including yeast and 
bacteriophage T4 (See O'Donnell). 

As also discussed in O'Donnell, the clamp of E. coli 
is called the β subunit. The crystal structure of β shows it 
is a ring shaped protein with 6 globular domains that encircle 
DNA. The functionally homologous clamp from yeast and humans 
is called proliferating cell nuclear antigen ("PCNA") and the 
functionally homologous clamp from bacteriophage T4 is called 
gp45. The structural homology of β to PCNA from yeast and 
humans and the phage T4 gene 45 protein was pointed out in 
Kong et al., "Three-Dimensional Structure of the β Subunit of 
E. coli DNA Polymerase III Holoenzyme: A Sliding DNA Clamp," 
Cell, 69:425-437 (1992) ("Kong") (copy attached hereto as 
Exhibit 4). Further, the structural similarity was proven 
with the crystal structure of yeast PCNA (Krishna et al., 
"Crystal Structure of the Eukaryotic DNA Polymerase 
Processivity Factor PCNA," Cell 79:1233-43 (1994) ("Krishna") 
(copy attached hereto as Exhibit 5)). Like β, PCNA is ring 
shaped with six domains and these domains have the exact same 
chain topology folding pattern as the E. coli β subunit clamp. 
Id. The crystal structure of the bacteriophage T4 clamp gp45 
shows it too is a 6 domain ring having the same chain fold as 
the clamps of E. coli, yeast, and humans. Examination of the 
GenBank sequence information shows that all identified 
bacterial β subunit genes (the dnaN gene) are homologous to 
the β subunit of E. coli (Fig. 2 in Kelman et al., 
"Structural and Functional Similarities of Prokaryotic and 
Eukaryotic DNA Polymerase Sliding Clamps," Nuc. Acids Res. 
23:3613-20 (1995) ("Kelman") (copy attached hereto as 
Exhibit 6)). These organisms are: Bacillus subtilis, 
Micrococcus luteus, Streptomyces coelicolor, Pseudomonas 
putida, Serratia marcescens, Salmonella typhimurium, Proteus 
mirabilis, Mycoplasma capricolum and Actinobacillus 
pleuropneumoniae. To carry the β to PCNA homology further, 
several eukaryotes have homologues of the clamp including: 
human, mouse, rat, frog (Xenopus laevis), fly (Drosophila 
melanogaster), Catharanthus roseus, carrot (Daucus carota), 
Glycine max, Oryza sativa, Schizosaccharomyces pombe, 
Saccharomyces cerevisiae, and Autographa californica (Fig. 3 
in Kelman). 
Thus, the accessory proteins (i.e., the clamp and clamp loader) 
of E. coli, bacteriophage T4, yeast, and humans are known by 
those of ordinary skill in the art to have functional 
similarity. 

Further, the functional similarity between the 
complexes of various organisms was demonstrated where the 
phage T4 proteins (clamp, clamp loader, and polymerase) were 
able to substitute for the human homologues in an in vitro 
assay of the SV40 replication system (Tsurimoto et al., 
"Functions of Replication Factor C and Proliferating Cell 
Nuclear Antigen: Functional Similarity of DNA Polymerase 
Accessory Proteins From Human Cells and Bacteriophage T4," 
PNAS, 87:1023-1027 (1990) ("Tsurimoto") (copy attached hereto 
as Exhibit 7)) (O'Donnell Declaration ¶ 11). 

Further, the sequence homology of the δ' subunit and 
the γ subunit in E. coli is disclosed in Dong et al., "DNA 
polymerase III accessory proteins, I. holA and holB encoding δ 
and δ'," J. Biol. Chem., 268:11758-65 (1993) ("Dong") (copy 
attached hereto as Exhibit 8). This homology in two subunits 
of polymerase III holoenzyme suggests that these were partners 
rooted in a common ancestor gene which duplicated and then 
endured slow successive changes. Thus, it would be known by 
those of ordinary skill in the art that this homology in 
particular would be carried over into other organisms. Indeed, 
the homology of the δ' and γ subunits extends to all five 
subunits of the RFC clamp loading complex of yeast and of 
humans (O'Donnell). They all have homology to the δ' and γ 
subunits of the E. coli γ complex. (Id.) Yeast and humans are 
the two most highly advanced eukaryotic replication systems. 
The fact that yeast is a lower eukaryote and human is a higher 
eukaryote and that the two systems span a large evolutionary 
distance makes it likely that everything in between also will 
be homologous. 

Accordingly, those of ordinary skill in the art 
would have been able to isolate the claimed protein subunits 
(and their encoding genes) from sources other than E. coli. 

The rejection of claim 1 under 35 USC § 102(b) as 
anticipated by Takase et al., "Genes Encoding Two Lipoproteins 
in the leuS-dacA Region of the Escherichia coli Chromosome," 
J. of Bacteriology 169(12):5692-99 (1987) ("Takase") is 
respectfully traversed. 

Takase relates to the coding of two lipoproteins by 
two genes, rlpA and rlpB, located in the leuS-dacA region on 
the Escherichia coli chromosome (O'Donnell Declaration ¶ 17). 
The rlpA gene encodes a lipoprotein having a molecular 
weight of 36K (O'Donnell Declaration ¶ 17). Figure 6 of the 
reference details the sequence of the 36K lipoprotein gene 
rlpA and its 5'- and 3'-flanking regions and the amino acid 
sequences deduced from the nucleotide sequence (O'Donnell 
Declaration ¶ 17). The position of the PTO is that this 
sequence matches that of the sequence encoding the claimed δ 
subunit. Applicants respectfully disagree. Takase also 
discloses a rlpB gene of the E. coli chromosome (O'Donnell 
Declaration ¶ 17). At the end of the sequence of the rlpB 
gene shown in Figure 7, the last 230 base pairs constitute a 
sequence that encodes the first 20-25% of the holA gene 
sequence (O'Donnell Declaration ¶ 17). Takase did not 
recognize this to be an open reading frame of a putative 
unknown gene, nor did the reference disclose the complete 
sequence of the gene (O'Donnell Declaration ¶ 17). The 
diagram attached as Exhibit 4 to the O'Donnell Declaration 
(diagram attached hereto as Exhibit 9) shows the overlap 
between the disclosed rlpB gene of Takase and the holA gene 
encoding the claimed δ subunit (O'Donnell Declaration ¶ 17). 
Further, as shown in Dong et al., "DNA Polymerase III 
Accessory Proteins," J. Biological Chem., 268(16):11758-765, 
11759 n. 3 ("Dong"), Takase's published sequence was incorrect 
and incomplete; in fact, the first 54 nucleotides of the δ 
gene are incorrect by 11 nucleotides (O'Donnell Declaration 
¶ 17). Thus, the δ protein subunit of polymerase III 
holoenzyme and the gene encoding the δ protein subunit of the 
polymerase III holoenzyme of the present invention are not 
disclosed by Takase. 

Claim 54 relates to "[a]n isolated protein subunit 
of polymerase III holoenzyme, wherein the subunit group is δ." 
Further, claim 59 relates to "[a]n isolated DNA molecule 
encoding a protein subunit of polymerase III holoenzyme, 
wherein the subunit group is δ." Takase does not teach the 
specified isolated protein subunit of polymerase III 
holoenzyme, nor the gene encoding that protein. Further, 
Takase does not disclose the claimed expression system or host 
cell. Therefore, the rejection based on Takase is improper 
and should be withdrawn. 

The rejection of claim 1 under 35 USC § 102(b) as 
anticipated by Stirling et al., "xerB, an Escherichia coli 
Gene Required For Plasmid ColE1 Site-Specific Recombination, 
Is Identical to pepA, Encoding Aminopeptidase A, a Protein 
With Substantial Similarity to Bovine Lens Leucine 
Aminopeptidase," The EMBO Journal 8:1623-27 (1989) 
("Stirling") is respectfully traversed. 

Stirling relates to the xerB gene from E. coli and 
the purification of the 55.3 kd xerB polypeptide. Figure 2(A) 
discloses the DNA sequence of the xerB gene, which is located 
within the nucleotide sequence of a slightly larger fragment. 
Stirling discloses that the xerB gene is a 503-codon open 
reading frame which would encode a polypeptide of 55.3 kd. 
The position of the PTO is that this sequence matches that of 
the sequence encoding the claimed χ subunit. Stirling, 
however, does not disclose the polymerase III holoenzyme nor a 
protein subunit of the polymerase III holoenzyme. In 
addition, Stirling only recognized the start of the gene open 
reading frame, but did not recognize the complete isolated DNA 
molecule or the complete protein it encodes. 

Claim 43 relates to "[a]n isolated protein subunit 
of polymerase III holoenzyme, wherein the subunit group is χ." 
Further, claim 47 relates to "[a]n isolated DNA molecule 
encoding a protein subunit of polymerase III holoenzyme, 
wherein the subunit group is χ." Stirling does not teach the 
specified isolated protein subunit of polymerase III 
holoenzyme, nor the gene encoding that protein. Further, 
Stirling does not disclose the claimed expression system or 
host cell. Therefore, the rejection based on Stirling is 
improper and should be withdrawn. 

The rejection of claim 1 under 35 USC § 102(b) as 
anticipated by Yoshikawa et al., "Cloning and Nucleotide 
Sequencing of The Genes rimI and rimJ Which Encode Enzymes 
Acetylating Ribosomal Proteins S18 and S5 of Escherichia coli 
K12," Mol. Gen. Genet. 209:481-88 (1987) ("Yoshikawa") is 
respectfully traversed. 

Yoshikawa relates to the rimI gene of E. coli, 
which encodes the RimI enzyme, which was deduced to 
contain 161 amino acid residues with a calculated molecular 
weight of 18,232. The nucleotide sequence of rimI is shown in 
Figure 3 of the reference. Yoshikawa did not recognize the 
segregated ψ subunit or the segregated gene encoding the ψ 
subunit. As pointed out in the present application, Yoshikawa 
"... does not indicate any appreciation of the gene as a 
coding sequence for the ψ peptide" (page 56, lines 3, 4). The 
holD gene actually overlaps the disclosed rimI gene for 32 
nucleotides. In other words, the last 32 nucleotides of holD 
(29 that encode amino acids, and 3 that encode the stop 
signal) are embedded within the first 32 nucleotides of the 
rimI gene disclosed in Yoshikawa. Yoshikawa does not, 
however, disclose this segregated sequence. In fact, 
Yoshikawa examined protein expression from the rimI gene, and 
did detect RimI protein, but failed to detect the segregated 
holD (ψ) protein expression even though the rimI gene 
contained the holD gene. 
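
Purely as an illustration of the kind of overlap described above, in 
which the end of one gene is embedded in the start of the next, the 
overlap can be expressed as the longest suffix of the upstream sequence 
that is also a prefix of the downstream sequence. The function name and 
the short sequences below are hypothetical placeholders; they are not 
the holD or rimI sequences, and the sketch assumes both sequences are 
given on the same strand. 

```python
def suffix_prefix_overlap(upstream: str, downstream: str) -> int:
    """Length of the longest suffix of `upstream` that is a prefix of `downstream`."""
    max_len = min(len(upstream), len(downstream))
    # Try the longest possible overlap first and shrink until one matches.
    for k in range(max_len, 0, -1):
        if upstream.endswith(downstream[:k]):
            return k
    return 0


# Hypothetical toy sequences (illustrative only, not real gene sequences).
upstream_gene = "ATGGCTAAACGTTGA"
downstream_gene = "CGTTGACCGGTA"
print(suffix_prefix_overlap(upstream_gene, downstream_gene))
```

Applied to two real sequences, a nonzero result would indicate how many 
terminal nucleotides of the upstream gene coincide with the beginning of 
the downstream sequence. 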

By contrast, claim 32 relates to "[a]n isolated 
segregated protein subunit of polymerase III holoenzyme 
wherein the subunit group is ψ." Claim 36 relates to "[a]n 
isolated DNA molecule encoding a segregated protein subunit of 
polymerase III holoenzyme wherein the subunit group is ψ." 
None of these limitations are disclosed in Yoshikawa. 
Further, there is no disclosure of an isolated protein, 
"wherein the protein corresponds to an amino acid sequence 
corresponding to SEQ. ID. NO. 38" as recited in claim 33 or an 
isolated DNA molecule "wherein the DNA molecule corresponds to 
a nucleotide sequence corresponding to SEQ. ID. NO. 39" as 
recited in claim 37. There is also no disclosure of the 
claimed expression system and host cell. Thus, the rejection 
based on Yoshikawa is improper and should be withdrawn. 

The provisional rejection of claim 1 under the 
judicially created doctrine of obviousness-type double patenting 
is respectfully traversed in view of the cancellation of 
claim 1 in the preliminary amendment dated August 5, 1997. 
Accordingly, this rejection should be withdrawn. 

In view of the foregoing, it is submitted that this 
case is in condition for allowance, and such allowance is 
earnestly solicited. 

Respectfully submitted, 

Michael L. Goldman 
Registration No. 30,727 
Attorney for Applicant 

Nixon, Hargrave, Devans & Doyle LLP 
Clinton Square, P. O. Box 1051 
Rochester, New York 14603 
Telephone: (716) 263-1304 
Telecopy: (716) 263-1600 

The basis for the remarkably high processivity of DNA 
polymerases that duplicate long chromosomes appears quite similar 
in prokaryotes and eukaryotes. In each of these cell types, the 
replicative polymerase has several accessory proteins which endow 
the polymerase subunit with its speed and processivity. The 
replicative polymerases of the well studied systems of 
bacteriophage T4, E.coli (DNA polymerase III holoenzyme) and humans 
(polymerase δ), contain accessory proteins which form a 'sliding 
clamp' on DNA that acts to tether the polymerase to the DNA 
for rapid and highly processive synthesis (1-4). In the E.coli 
system this sliding clamp has been shown to be a dimer of the 
β subunit, which is in the shape of a ring encircling the DNA 
(5). The functional homologue of β in the T4 system is the product 
of gene 45 (g45 protein) and in humans it is the proliferating cell 
nuclear antigen (PCNA). These proteins are homologous in 
function, and although they show no homology at the amino acid 
sequence level, a case has recently been made for a structural 
similarity of β to PCNA and to the T4 g45 protein on the basis 
of sequence using the crystal structure of β as a guide (5). 

These replicative polymerases each require several other 
accessory proteins to assemble the sliding clamp around the DNA. 
A list of these proteins is presented in Table I. The requirement 
for more than one protein in this reaction may reflect a need for 
several functions, including recognition of a primed template, 
the opening and closing of the ring shaped clamp protein around 
the DNA, and the coupling of this process to ATP hydrolysis. 
In the E.coli system this 'clamp loader' is the 5-protein γ complex 
(γδδ'χψ), in humans it is the 5-protein Activator 1 (A1), also 
referred to as RF-C, and in phage T4 it is the 5 subunit g44 
protein/g62 protein complex (the g44/62 complex). 

The mechanism of the 'clamp loader' in these three systems 
may be similar. The sites of binding to the primer-template 
junction are similar for the E.coli γ complex and β subunits, the 
human A1 and PCNA, and the g44/62 and g45 proteins (8-11). 
In the case of E.coli, the loading of the β subunit forms a 
'preinitiation complex', dependent upon ATP hydrolysis, that can 
be isolated by gel filtration (12,13). The action of the γ complex 
in this reaction can be substituted by pairs of the subunits, γδ, 
or τδ' (14). The preinitiation complex assembled by the yeast 
or human A1 and PCNA and dependent upon ATP hydrolysis 
can also be isolated by gel filtration (7,15). The complex 
assembled by the T4 proteins on primed DNA differs, in that 
while it can be cross-linked to the DNA, it is not stable enough 
to be isolated after gel filtration; a complex containing the g45 
can only be isolated by gel filtration if the DNA polymerase (g43) 
is present with the g44/62 and ATP (9,16). 

The ATPase activity of the multisubunit complex in each of 
the three systems is required to assemble the 'sliding clamp' and 
the polymerase onto the template, but is not required for the 
subsequent replication of long stretches of the template in vitro. 
This indicates that ATP hydrolysis by the accessory protein 
complex is not required for translocation by the polymerase, at 
least along a template unencumbered by other bound proteins. 

Recently, the genes encoding most of these subunits have been 
identified. Comparison of the amino acid sequences in Fig. 1 
shows several of these subunits are truly homologous between 
E.coli, humans and phage T4, especially in one region toward 
the middle (110-160), implying for the first time that the 
functional similarity among these systems has its basis in an 
evolutionarily conserved structure. Presumably all these proteins 
evolved from the same ancestral gene. 

The homology between these protein subunits actually starts 
near the amino terminus, continues through the ATP-binding 
domain which starts approximately at the [illegible] 'GKT' box, 
and extends for about two hundred amino acid residues toward 
their carboxyl termini. All these proteins have a strictly conserved 
'SRC' sequence of unknown function at positions 157-159. The 
presence of the four amino-acid 'DEAD' motif in three of the 
human protein subunits represents an unusual occurrence of this 

Table I. 

                    E.coli          Eukaryotic            phage T4 

DNA polymerase      Pol III core    polδ (polε)*          g43 

Accessory complex   γ complex       Activator 1 (A1)      g44/62 
                                    (also called RF-C) 

Accessory protein   β               PCNA                  g45 
clamp 

†The 10th subunit of E.coli DNA polymerase holoenzyme, τ, binds together two 
molecules of pol III core (24, 25) presumably for coordinated synthesis of both 
strands of duplex DNA. 

*Polε is a different molecule from polδ and it is also activated by the eukaryotic 
accessory proteins (6,7). 
sequence in proteins not having RNA-dependent ATPase or 
putative RNA helicase activities. However, as mentioned above, 
ATP hydrolysis by these protein complexes is not required for 
translocation along the DNA template by the polymerase. 

An interesting feature to emerge from the sequence 
comparisons is the similarity among subunits present within one 
polymerase accessory factor. In E.coli, the δ' subunit shares 27% 
identity to the γ and τ subunits; γ and τ are derived from the 
same gene and therefore have identical sequences (a translational 
frameshift produces γ (47 kDa) which is 24 kDa shorter than 
τ, ref. 23). The gene encoding δ' (holB) also produces two 
subunits, δ'small and δ'large, which differ by 520 Da. δ'small is 
the product of the full gene and therefore δ'large is produced by 
a modification, possibly a translational frameshift. In humans, 
four subunits of the A1 are highly homologous to one another. 
Hence, the 'clamp loader' complexes of E.coli and humans 
appear to contain a family of subunits of similar structure within 
them. Although the phage T4 g44 protein is not homologous to 
g62 protein, the subunit composition of the g44/g62 complex is 
4:1 (g44-to-g62) (17). Thus, the g44/62 complex also has 
structural redundancy and may be compared to the case of the 
human proteins, in which four different proteins having 
homology, the 40, 38, 37, and 36 kDa subunits, are present in 
the complex with one additional subunit, of 145 kDa, that has 
little homology to the other four. 

Why there is such structural redundancy within these complexes 
is unclear at present. However, these sequence alignments 
underscore the basic similarity in structure and function of the 
cellular replicase accessory proteins across the evolutionary 
spectrum. 

Rules governing the efficiency and polarity of 
loading a tracking clamp protein onto DNA: 
determinants of enhancement in bacteriophage T4 
late transcription 



Glenn M.Sanders, George A.Kassavetis and 
E.Peter Geiduschek 

Department of Biology and Center for Molecular Genetics, 
University of California, San Diego, La Jolla, CA 92093-0634, USA 

The bacteriophage T4 DNA polymerase accessory 
proteins confer processivity and high speed on 
replicative DNA chain elongation: the gene 45 protein, 
gp45, tracks along DNA and serves as the sliding clamp 
of the viral DNA polymerase; the gene 44/62 protein 
complex, gp44/62, is an ATP-dependent loading enzyme 
that mounts gp45 on DNA. Gp45 also activates T4 late 
transcription. Transcriptional enhancement by gp45 
requires a particular orientation that is imposed by 
gp44/62 at the DNA loading site. Loading and orienting 
gp45 on DNA, tracking along DNA and interaction with 
RNA polymerase have been analyzed by measuring 
transcriptional activation. The efficiency of loading 
gp45 at different DNA structures and the resulting 
transcriptional activation have been compared, and 
sources of interference with transcriptional activation 
have been examined. All observations are compatible 
with a mechanism in which the loading enzyme 
recognizes the polarity of single-stranded DNA and imposes a 
corresponding polarity of DNA entry on gp45. 
Primer-template junctions are the most efficient DNA loading 
sites for gp45 and can generate very rapid opening at 
promoters that are located at a distance of >1 kbp. In 
contrast, gp45 does not track efficiently across 
single-stranded DNA. 

Keywords: DNA tracking proteins/enhancers/gene 
regulation/processivity factors/late genes 



Introduction 

The primary replicative DNA polymerases of eukaryotes, 
prokaryotes and certain viruses share common elements 
of structure and function: they are composed of a core 
enzyme, capable of relatively [illegible] 
synthesis and exonucleolytic degradation, and a set of 
accessory proteins that confer processivity on the core 
enzyme (Kornberg and Baker, 1992; Kuriyan and 
O'Donnell, 1993; Nossal, 1994). The accessory proteins 
comprise two components: a processivity factor proper, 
the 'sliding clamp', and an ATP-dependent assembly 
factor, the 'clamp loader'. The structures of two sliding 
clamps, the β subunit of Escherichia coli DNA polymerase 
III holoenzyme and yeast proliferating cell nuclear antigen 
(PCNA), have been determined: they are toroidal 
multimers, capable of encircling DNA (Kong et al., 1992; 
Krishna et al., 1994). Indeed, eukaryotic (PCNA), bacterial 
(β) and viral [bacteriophage T4 gene 45 protein (gp45)] 
clamps have been shown to 'track', that is to diffuse freely 
on DNA although confined topologically, once they have 
been loaded onto DNA at an appropriate site by the 
cognate 'clamp loader' (Stukenberg et al., 1991; Burgers 
and Yoder, 1993; Tinker et al., 1994b; Podust et al., 1995). 
How the clamp loaders, clamps and DNA loading sites 
interact can also be understood through analysis of DNA 
replication and through measurements of the ATPase 
activities of these clamp loaders (reviewed by Young 
et al., 1992, 1994; Nossal, 1994). 

A role for one set of DNA polymerase accessory 
proteins in transcriptional activation has been identified 
recently, permitting examination of accessory protein 
loading and tracking in the context of transcription. The late 
genes of bacteriophage T4 are transcribed from extremely 
simple promoters, consisting of TATAAATA centered 
~10 bp upstream of a transcriptional start site. The ability 
to recognize these promoters is conferred on E.coli RNA 
polymerase core by a small σ-family protein encoded by 
T4 gene 55 (gp55), but transcription is weak, particularly 
in relaxed or linear DNA, and further suppressed by the 
small RNA polymerase-bound T4 gene 33 protein. Gp33 
is a transcriptional co-activator; it confers the ability to 
support activation of transcription by gp45, the 'sliding 
clamp' of the T4 DNA polymerase holoenzyme 
(Herendeen et al., 1990). Although gp45 alone can activate 
gp55-directed transcription in the presence of gp33 under 
conditions of macromolecular crowding (Sanders et al., 
1994), under conventional reaction conditions it 
additionally requires the 'clamp loader' encoded by T4 genes 44 
and 62 (the gp44/62 complex). The latter loads gp45 onto 
DNA in an ATP hydrolysis-dependent process at 
enhancer-like entry sites that can be located at a considerable distance 
from the promoter (reviewed by Brody et al., 1995). 

Gp45 tracks along DNA, 
encounters RNA polymerase, and eventually becomes 
stably associated with the upstream end of the activated 
open complex (Tinker et al., 1994a). Tracking along 
DNA is an essential part of this transcriptional activation 
mechanism: a continuous and open path along DNA 
connecting an enhancer and its promoter is required for 
activation by the T4 DNA polymerase accessory proteins 
(Herendeen et al., 19[illegible]). 

The most studied enhancers of T4 late transcription 
have been nicks in DNA, which can activate transcription 
from upstream or downstream of a target promoter. A 
characteristic polarity distinguishes these cis-acting sites 
from conventional enhancers: the nick must be in the 
non-transcribed strand of its target transcription unit 
(Herendeen et al., 1989). It has been suggested that 
polarity of transcriptional activation is due to assembly of 
an asymmetric gp44/62-gp45 complex at the 
nick-as-enhancer, imposing a particular orientation on gp45 as it 
tracks along DNA and thereby determining the orientation 
of RNA polymerase that is compatible with formation of 

Fig. 2. Efficiencies of transcriptional activation generated by different 
gp45 loading sites. A mixture of T4-modified RNA polymerase, gp55, 
gp33 and the DNA polymerase accessory proteins (gp44/62 complex 
and gp45) was added to pDH310 (shown at the bottom of the figure) 
DNA, gp32 and dATP pre-equilibrated at [illegible]. Aliquots were 
withdrawn at the times noted on the abscissa and added to a mixture 
of unlabeled and radioactive ribonucleoside triphosphates and 
rifampicin pre-equilibrated at 25°C for a single round of transcription. 
The yield of 420 nt RNA is reported on the ordinate (in arbitrary 
units). Further experimental details are provided in Materials and 
methods. (a) Comparison of primer-template and 3'-tailed junctions 
with nicked circular DNA. (■): linear DNA with 5' overhanging 
(primer-template junction) ends; (●): linear DNA with 3'-tailed ends; 
(▲): nicked circular DNA; (△): unenhanced transcription (accessory 
proteins omitted). The distances from the relevant gp45 loading sites 
(see below) to the enhanced promoter are ~1520 bp, ~1070 bp and 
~220 bp for the primer-template junction, 3' tail and nick, 
respectively. (b) Effects of gp32 and E.coli SSB (SSBe) on the 
efficiency of transcriptional activation. DNA with 5' overhanging 
(closed symbols) and 3' overhanging (open symbols) ends, without 
SSB (circles), with SSBe (triangles) or with gp32 (squares) was 
analyzed. (c) Effects of gp32 and SSBe on activation of transcription 
of DNA with 5' overhanging ends, at one-fourth of the concentration 
of each of the DNA polymerase accessory proteins used in (b). Gp32 
(closed symbols) and SSBe (open symbols) were used at 100 µg/ml 
(triangles) or 50 µg/ml (squares). 



the transcription initiation inhibitor rifampicin. The assay
measures the end-product of a complex reaction sequence,
which requires an adequate measure of thermal equilibration,
assembly of a gp44/62-gp45 complex at a loading
site, entry of gp45 for tracking along DNA, interaction
with RNA polymerase, promoter location and opening.
Nevertheless, primer-template junctions generate half-maximal
promoter opening within 1-2 min, at modest
concentrations of the DNA polymerase accessory proteins
(Figure 2a, closed squares). Promoter opening of nicked
circular templates and of templates bearing recessed 5'
ends (i.e. 3' tailed) is relatively slow, with 50% rise times
of ~10 min and >30 min, respectively (Figure 2a, closed
triangles and closed circles, respectively). Unenhanced
transcription is negligible under these conditions (open
triangles). Thus, the most effective of these enhancers
(see below) is a primer-template junction that is located
~1.5 kbp from the activated promoter.

SSB proteins affect the ability of single-stranded
DNA regions to generate enhanced transcription

The preceding experiments were carried out in the presence 
of gp32, the T4-encoded SSB. The effect of omitting gp32 
or of substituting the E.coli SSB (SSBe) on activation of 
transcription has also been examined (Figure 2b). In the
absence of any SSB, 3'-tailed DNA does not generate
activated transcription (open circles), and SSBe does not
rescue activation (open triangles), implying that the lack
of transcription is not merely due to sequestration of
transcription components by single-stranded DNA. Significant
rates of promoter opening with 3' tails are only
obtained in the presence of gp32 (open squares). In
contrast, primer-template junctions generate modest rates
of promoter opening in the absence of any SSB (closed
circles), and the rate of promoter opening is greatly
increased by providing SSBe (closed triangles). Gp32
further increases the rate of promoter opening (closed
squares). SSBe and gp32 have no effect on transcription
of nicked, circular DNA (data not shown). Thus, a
single-stranded region of DNA is necessary for the SSBs to
manifest their effects on activation of transcription.

The profound effect of both SSBe and gp32 on enhancement
of transcription from a primer-template junction
implies that simply masking exposed single-stranded DNA
greatly improves loading of gp45. The next experiment
(Figure 2c) looks for a more specific role of SSB in gp45
loading by examining quantitative differences in the ability
of gp32 and SSBe to facilitate transcriptional enhancement.
Since SSBe and gp32 differ greatly in their modes of
DNA association (Lohman and Ferrari, 1994), differences
in transcriptional activation could also reflect different
degrees of saturation of single-stranded DNA. Rates of
promoter opening were examined at limiting concentrations
of the DNA polymerase accessory proteins (in order
to slow down the otherwise rapid rate of open complex
formation and accentuate differences between gp32 and
SSBe in promoting enhanced transcription). At their
respective saturation limits (excluding the possibility that
quantitative differences in activation reflect different
degrees of saturation of single-stranded DNA with SSB),
gp32 is a substantially more effective co-factor than SSBe
for enhancement of transcription from a primer-template
junction (Figure 2c, compare closed symbols with open
Fig. 4. Downstream promoters and upstream loading sites can interfere
with activation of transcription. (a) Single-round transcription of linear
pDH72AEl DNA (Figure 3a) with 5' overhanging ends, and two T4
late transcription units in tandem (open triangles), compared with a
similar template (pDH72A420) from which the right (420 nt) promoter
has been deleted (filled triangles). The yields of 314 nt RNA from the
left promoters are compared as a function of RNA polymerase
concentration. (b) Single-round transcription of linear pDH72AEl with
primer-template junction loading sites for gp45 at both ends (open
triangles) or only at the downstream end (filled triangles).
The ratio of the molar yield of left promoter-derived RNA (314 nt) to
right promoter-derived RNA (420 nt) is compared as a function of
DNA polymerase accessory protein concentration, one unit
corresponding to 160 nM gp44/62 complex with 90 nM gp45.

from its downstream, but not its upstream, gp45 loading
site. Plasmid pGS724 has two divergent T4 late transcription
units, separated by a single EcoRI site. Addition of
EcoRI-Gln111 to this template isolates each promoter
from its upstream, but not its downstream, enhancer
(Figure 3b and c, top).

Primer-template junctions and 3' tails at both ends of
pGS722 are compatible with activation of transcription at
both promoters (Figure 3b and c, lanes 2). Addition of
EcoRI-Gln111 abolishes enhanced transcription from both
promoters, regardless of whether the loading site is a
primer-template junction (Figure 3b, lanes 3-5) or a 3'
tail (Figure 3c, lanes 3-5). Activation of transcription on
pGS724 is also effective at both promoters (Figure 3b and
c, lanes 7), but addition of EcoRI-Gln111 to pGS724 has



little effect on enhanced transcription, again regardless of
whether the loading site is a primer-template junction
(Figure 3b, lanes 8-10) or a 3' tail (Figure 3c, lanes 8-10).
That a single EcoRI site is sufficient to block DNA
tracking by gp45 has been shown previously (Herendeen
et al., 1992) and is verified here by the disappearance
of an enhancement-dependent, higher molecular weight
transcript upon addition of EcoRI-Gln111 (data not
shown). We conclude that activation of transcription from
a single-stranded DNA end only occurs from downstream
of a T4 late promoter, and can be generated from
primer-template junctions or 3' tails.

In contrast, the results of Figure 1 suggest that only
promoters with single-stranded extensions on their transcribed
strands are activated (lanes 11-14, in particular).
A solution to this apparent conflict lies in the orientation
of the promoters of pDH82, the requirement for a clear
and unobstructed pathway between loading site and promoter,
and the relative efficiency of a primer-template
junction as an activator of transcription. Activation of
each of the convergent promoters of pDH82 (Figure 1)
by a downstream loading site necessitates gp45 tracking
past the other promoter. Open promoter complexes should
form roadblocks to enhancement of other promoters in
cis. In the case of pDH82 (Figure 1) bearing an efficient
primer-template junction loading site for gp45 at one end,
and a relatively weak 3' tail loading site at the other
('top-long' and 'bottom-long' DNA), the promoter upstream of
the efficient loading site is expected to open quickly (Figure
2) and block enhancement of the promoter upstream of
the inefficient loading site.

Traffic problems on the track: upstream enhancers 
and downstream promoters can interfere with 
activation 

The preceding supposition has been tested directly by
comparing open promoter complex formation on
pDH72AEl (Figure 4, top) with open promoter complex
formation on a derivative template, pDH72A420, from
which the potentially interfering downstream promoter
has been deleted. Indeed, the yield of transcripts is
increased by deletion of the downstream promoter (Figure
4a), at limiting or excess RNA polymerase, and is therefore
unlikely simply to reflect competition for RNA polymerase
by the additional late promoter on pDH72AEl.

To explain more fully the weak activation of the
upstream promoter in Figure 3, we examined whether
gp45 loaded at a non-activating site (upstream of the
proximal promoter, the stream referring to the direction
of transcription) can interfere with activation from a gp45
loading site located downstream of that promoter. The
effect of accessory protein concentration on the relative
yield of transcripts from tandem promoters was compared
on DNA with a gp45 loading site only at the downstream
end (the upstream end having been made blunt), and on
DNA with gp45 loading sites at both ends, placing a
non-activating gp45 loading site upstream of the left-hand
promoter (Figure 4b). Increasing the accessory protein
concentration increases the relative yield of transcripts
from the left promoter on the template with only a
downstream gp45 loading site (filled triangles), but not
on the template with gp45 loading sites at both ends (open
triangles). These results are consistent with the supposition
Materials and methods. [...] probe, 5' end-labeled in the short strand; lanes 5-8: 3'-tailed
[...]. Lanes 2, 6, 10 and 14: with DNA polymerase accessory proteins alone; lanes 3, 7, 11 and 15:
with gp32 alone; lanes 4, 8, 12 and 16: with accessory proteins and gp32.



that allows partial invasion of the duplex region. The
failure to protect the complementary short strand is consistent
with the high cooperativity of gp32 binding to
single-stranded DNA. The length of single-stranded DNA that
is exposed at the end of the complementary DNA strand
would accommodate only one molecule of gp32, which
would bind relatively inefficiently (Kowalczykowski et al.,
1981). Nevertheless, melting of this short section of the
5'-recessed strand at the double strand-single strand
junction could render it capable of being loaded with gp45
in an orientation that is compatible with transcriptional
activation. The efficiency of gp45 loading at this site is
at least an order of magnitude lower than at the
primer-template junction (Figure 2a), consistent with the lack of
a clear-cut footprint of DNA polymerase accessory proteins
in lanes 8 and 16.

Discussion 

A model to explain the transcriptional 
enhancement properties of diverse DNA structures 
Nicks in DNA, gaps and single-stranded extensions of 
either polarity all serve as loading sites for the DNA 



polymerase processivity factor and transcriptional activator
gp45, but with greatly different efficiencies (Figures 1
and 2). Interference with transcriptional activation can
arise in two ways: gp45 tracking along DNA in the
activation-incompatible orientation diminishes the activity
of properly oriented gp45 (Figure 4b), and open promoter
complexes can block the track to gp45 (Figure 4a).

The seemingly disparate properties of different gp45-loading
sites and of differently organized and oriented
transcription units can be reconciled by a single model
with the following features (Figure 7): (i) The loading
enzyme, gp44/62 (Kaboord and Benkovic, 1995), recognizes
the 5'->3' polarity of the continuous DNA strand
at the loading site. (ii) The gp44/62-single-stranded DNA
interaction determines the polarity of loading of gp45.
(iii) Once the orientation of gp45 on DNA is determined,
it is not reversed subsequently (Tinker et al., 1994a).
(iv) When gp45 enters its complex with gp44/62 at the
DNA loading site, it is situated on the side of gp44/62
that faces the 3' end of the continuous or longer DNA
strand. At a primer-template junction, this places gp45
over double-stranded DNA (Figure 7, line b), consistent
with the prior footprinting and photocrosslinking analyses
orientation of gp45 loading is established by the 5'->3'
polarity of the continuous strand, which must be the
transcribed strand of the activated promoter. We conclude
that gp45 is first released from its loading site at the DNA
nick toward the left, as drawn on Figure 7, line g, because
converting the nick into an ~250 nt gap confines activation
by gp45 to that side (line h and Figure 5).

The known properties of the T4 DNA polymerase 
accessory proteins are consistent with this model. Single- 
stranded DNA and primer-template junctions are equally 
effective co-factors for the DNA-dependent ATPase 
activity of gp44/62 complex alone; a preference for the 
primer-template junction is conferred by gp45 (Jarvis 
et al., 1989b). A photocrosslinking analysis of the DNA
polymerase accessory protein complex at a primer-template
junction places gp45 on double-stranded DNA
(Capson et al., 1991), consistent with the supposition that
the specific preference of the gp44/62-gp45 complex for
a primer-template junction is conferred by DNA duplex-confined
gp45 stabilizing the placement of gp44/62 complex
on single-stranded DNA. The cellular counterparts
of gp45, the β subunit of E.coli DNA polymerase III
holoenzyme and PCNA, are tori (Kong et al., 1992;
Krishna et al., 1994). Anticipating that gp45 will also turn
out to be toroidal, one can rationalize readily that the
central hole of gp45 might be too small for tracking along
gp32-laden single-stranded DNA and that DNA secondary
structure, as well as direct interactions of nucleotides in
single-stranded DNA with the external faces of the protein
catenane, would also obstruct tracking along bare
single-stranded DNA. We do not know the size of the smallest
gap that forms an effective barrier to gp45 tracking, but
would not be surprised to find gp45 able to cross a short
gap such as might be created at an Okazaki fragment
junction by digestion of the RNA primer (Nossal, 1994).
If the gp45 trimer (Jarvis et al., 1989a) is a PCNA-like
torus with a 3-fold axis down its central cavity (Krishna
et al., 1994), then its lateral faces must be non-identical,
capable of different specific protein-protein interactions
and of manifesting the polarity that is required
for the properties of transcriptional enhancement at late
promoters exhibited in this and prior work (Herendeen
et al., 1989; Tinker et al., 1994a).

Efficiencies of transcriptional activation

Extraordinarily effective transcriptional activation, reflected
in very rapid promoter opening, is afforded by
primer-template junctions in the presence of gp32 (Figure 2).
Activation, as measured by the rate of accumulation of
open promoter complexes, is relatively inefficient in the
absence of any SSB (Figure 2), probably due to
non-productive sequestration of proteins on single-stranded
DNA tails. This can be prevented by adding (the heterologous)
SSBe. SSBs probably also prevent tracking gp45
from falling off the ends of linear DNA. The effect is
comparable with that of confining tracking gp45 by using
a circular DNA template, or by blocking the ends of its
linear DNA with tightly bound protein (cf. Stukenberg
et al., 1991; Herendeen et al., 1992).

There is, in addition, a specific quantitative effect
of gp32 on the enhancer efficiency of primer-template
junctions (Figure 2) and an absolute requirement for gp32
that is not filled by SSBe when recessed 5' DNA ends
serve as gp45 loading sites. We think it plausible to
attribute these effects to specific interactions of gp32 with
gp45 as well as with gp44/62 complex (Formosa et al.,
1983). That gp32 strongly increases the affinity of gp44/62
complex and gp45 for a primer-template junction, as
judged by footprinting and gel filtration (Munn and
Alberts, 1991a; Richardson et al., 1989), is probably due
to these interactions. Photocrosslinking experiments have
shown recently that specific interaction with gp32 increases
the density of gp45 tracking on DNA, and that the general
non-specific DNA binding activity of SSBe or the human
SSB, RPA, cannot substitute for gp32 in this regard
(Tinker et al., 1994b).

Since the rate of open promoter complex formation can
be extremely rapid, and is strongly dependent on the
properties of the gp45 loading site, events subsequent to
gp45-RNA polymerase encounter must also be very rapid.
We infer that differences in rates of open promoter complex
accumulation primarily reflect differences in rates of gp45
loading (and possibly of unloading, which would also
affect the density of tracking gp45). This view is consistent
with the observed hierarchies of transcriptional activation.
Optimal activation of transcription is achieved with
primer-template junctions; nicks and 3' tails are less
efficient (Figure 2), because they are sub-optimal as
binding sites for the gp44/62 complex (e.g. Figure 6) and
consequently load gp45 less efficiently. Current experiments
(T.-J. Fu, personal communication) are directed at
exploring the dynamics of gp45 loading and tracking.

In closing, we want to point out that the most efficient
enhancer in this resolved, highly purified and simplified
in vitro system is not necessarily the primary contributor
to T4 late transcription in vivo. Primer-template junctions
are sites of assembly of DNA polymerase holoenzyme
and of active DNA chain elongation, both expected to
compete with gp45 loading. (Preliminary experiments
confirm, for example, that stalled DNA polymerase
holoenzyme blocks T4 late transcriptional enhancement by
gp45; G.M.S., unpublished observations.) On the other
hand, rapid promoter opening (Figure 2) allows a variety
of gp45 loading sites to contribute incrementally to T4
late gene activity, and may also permit a single gp45
trimer to activate several rounds of transcription initiation
during a single episode of tracking along DNA.

Materials and methods

Labeled and unlabeled nucleoside triphosphates, bovine serum albumin
(BSA), terminal deoxynucleotidyl transferase, exonuclease III
(exo III), DNA ligase, DNase I, proteinase K, restriction enzymes and
rifampicin were purchased from various commercial suppliers.

Plasmids pDH82, pDH310 and the pDH72 series have been described
(Herendeen et al., 1989, 1992; Herendeen, 1991). Plasmid pGS722 is a
derivative of pDH72AEl in which the 420 nt T4 late transcription unit
was inverted by excision with AxM and NsH restriction enzymes,
digestion of the 3' overhanging ends with T4 DNA polymerase, and
re-ligation of the fragments. Plasmid pGS724 was generated by inserting
the SmaI-AccI fragment of pTEllO, containing the 420 nt T4 late
transcription unit, into the SmaI site of pTElU (Elliott and Geiduschek,
1984). Plasmid pDH72A420 (Figure 4a) was generated by deleting the
150 bp Nstt-DraWX fragment of pDH72, which contains the 'right' T4
late promoter that yields 420 nt RNA, but retaining the T4 late
transcription unit that yields 314 nt RNA. Plasmid pGS725 was generated
by adding EcoRI linkers to the AJtOX and EcoXW sites of pDH72A£123.
(Complete sequences of these plasmids are available from the authors
on request.)
The protein apparatus that functions to replicate DNA has
been likened to an efficient machine that rapidly and accurately
duplicates genetic information for the next generation.
Recent studies on both Escherichia coli and bacteriophage
T4 not only reinforce this idea, but demonstrate
that the replication machinery can monitor progress, cope
with different molecular situations, and detect problems.

The Prokaryote Fork

The proteins that replicate the E. coli and bacteriophage
T4 genomes have been identified and characterized in
detail (see Table 1; Kornberg and Baker, 1992). It is not
surprising that for such a fundamental process as replication
of DNA, the functions of the individual proteins have
been conserved, even in eukaryotes (see below). It is a
little surprising, however, that in many cases, but not all,
there is no obvious similarity between the amino acid
sequences in proteins of the same functional class (Kong et
al., 1992; O'Donnell et al., 1993). For example, the E. coli
single-stranded DNA-binding protein (SSB), the phage T4
gene 32-encoded protein (32 protein), and the human
replication protein A (RPA) are functionally similar, yet
sequence unrelated. The same is true for the β subunit of
polymerase III (pol III) from E. coli, the 45 protein from T4,
and the human proliferating cell nuclear antigen (PCNA).

At the prokaryote DNA replication fork, a DNA helicase
(DnaB or 41 protein) precedes the DNA synthetic machinery
and unwinds the duplex parental DNA in cooperation
with the SSB (see Figure 1A). Because of the complementary
and antiparallel nature of the DNA double helix, DNA
replication of the two strands occurs in a fundamentally
different manner (see Figure 1A). On one strand, the leading
strand, replication occurs continuously in a 5' to 3'
direction, whereas on the other strand, the lagging strand,
DNA replication occurs discontinuously by synthesis and
joining of short Okazaki fragments. The leading-strand
replication apparatus consists of a DNA polymerase (pol
III core or 43 protein), a 'sliding clamp' (β or 45 protein),
and 'brace' proteins (γδδ'χψ [the γ complex] or 44/62
proteins). All these proteins cooperate to replicate DNA
in a rapid and processive manner, as discussed below.
For example, the polymerase holoenzymes can easily
synthesize greater than 50,000 nucleotides at more than 500
nt per second from a single primer without dissociating
from the DNA template.

The brace proteins load the sliding clamp and the DNA
polymerase onto the primer, either the initial primer at the
origin of DNA replication during initiation of leading-strand
synthesis or each RNA primer for Okazaki fragment
synthesis (Capson et al., 1991; Munn and Alberts, 1991;
Stukenberg et al., 1991; Gogol et al., 1992; Kong et al., 1992).
The brace proteins form an ATP-dependent,
structure-specific DNA-binding protein complex that recognizes the
primer-template junction (Figure 1B). The sliding clamp
is then loaded onto the DNA adjacent to the brace proteins,
forming a ring that encircles the duplex DNA behind the
primer-template junction. A dimer of the β protein forms
a ring around the DNA, whereas a trimer of the 45 protein
forms a ring that is likely to be structurally very similar to
the β dimer. The 45 monomer is two-thirds the mass of the
β monomer, indicating how proteins with different primary
sequences, mass, and multimeric state can nevertheless
adopt similar tertiary structures and perform similar
functions. The clamp stimulates the ATPase activity of the
brace proteins. When bound to the primer-template, the
brace and clamp load the DNA polymerase onto the DNA.
Addition of all four dNTPs allows efficient and processive
DNA synthesis for many thousands of nucleotides without
displacement of the proteins from the template.

Interestingly, the lagging strand is synthesized by the
same apparatus. Because the DNA is synthesized in a
discontinuous manner, there is a constant need for primers
to initiate the short (~1000 nt long) Okazaki fragments.
Thus, an additional protein, the DNA primase (DnaG or
61 protein), is needed to form short RNA primers that are
then elongated by the polymerase, aided by its brace and
• DNA polymerase δ has been shown to function in SV40 DNA replication;
an essential DNA polymerase, DNA polymerase ε, has not yet
been assigned a specific function in DNA replication.

• The human DNA polymerase α and primase activities function as a
multiprotein complex to synthesize RNA-DNA primers.
Figure 1. Proposed Generic Model for the Prokaryote
and Eukaryote DNA Replication Fork

(A) As suggested by B. Alberts, a dimeric DNA
polymerase coordinately replicates the two
DNA strands.

(B) Primer recognition by the brace and sliding
clamp proteins. The brace proteins (γ complex
of E. coli, 44/62 proteins from T4 phage, and
RFC from eukaryotes) bind to the primer-template
junction in an ATP-dependent manner.
The sliding clamp (β protein from E. coli, 45
protein from T4, and PCNA from eukaryotes)
binds to the brace and DNA in a ring-like structure.
The brace-clamp complex then attracts
the DNA polymerase.



clamp (Stukenberg et al., 1991, 1994 [this issue of Cell];
Hacker and Alberts, 1994a, 1994b). Both leading- and
lagging-strand DNA replication occur coordinately, facilitated
by dimerization of the two polymerase complexes. The
E. coli τ protein mediates pol III dimerization, whereas
the 43 polymerase probably exists as a dimer. Another
difference between these machines is that the E. coli pol
III holoenzyme (αεθ.γδδ'χψ.τ.β) forms a stable protein
complex without DNA, whereas the T4 holoenzyme
(43.44/62.45) is assembled on the template.

A paradox arises when considering how these machines
perform highly processive and continuous DNA synthesis
on the leading strand yet, in another context, cycle on and
off the lagging strand during discontinuous synthesis of
Okazaki fragments. Recent studies of the phage and bacterial
machines have suggested a solution to this paradox
by demonstrating that the polymerase machines are smart
enough to detect the different modes of DNA replication
(Hacker and Alberts, 1994a, 1994b; Stukenberg et al.,
1994). Both groups have suggested that the polymerase
recognizes a key difference between leading- and
lagging-strand DNA replication: the polymerase synthesizing a
nascent Okazaki fragment will encounter the 5' end of the
previous Okazaki fragment, whereas the leading-strand
polymerase is free of such roadblocks.



Stukenberg et al. (1994) have studied the fate of individual
proteins when the polymerase machine runs into duplex
DNA, taking advantage of the availability of large
amounts of the pol III components and the ability to label
individual protein components specifically. Design of
these experiments was aided by knowledge of the
three-dimensional structure of the β clamp. The surface of the
clamp that interacts with the polymerase core was modified
by addition of a protein kinase recognition site. By
following the rate of phosphorylation of this site by excess
protein kinase, association and dissociation of the clamp
from the polymerase core were measured. This technique
of 'site-specific protein footprinting' is one that should become
popular in studies of protein-protein interaction.
When the polymerase core and brace proteins (γ complex)
were traveling along the SSB-coated single-stranded DNA
template (mimicked by removing one of the four dNTPs
and tricking the polymerase into believing it is productively
replicating DNA), both were firmly attached to the β clamp
and the template. When, on the other hand, the holoenzyme
was moving along the template DNA and encountered
a DNA duplex (as in the case of a downstream Okazaki
fragment), the β clamp dissociated from the other
components and remained associated with the newly
synthesized DNA (Figure 2A). Dissociation of the polymerase
and brace from the template DNA allowed the polymerase
Figure 2. Steps in the Replication of the Lagging
Strand

See text for details. Adapted from Stukenberg
et al. (1994) (A) and from Waga and Stillman
(1994) (B). Step 7 in (B) is hypothetical, based
on E. coli and phage T4 studies.
to cycle onto a different primer-template, facilitated by
another clamp and brace attached to the new primer.
These studies demonstrated that the holoenzyme has the
capacity to detect a different molecular environment and
modify its activity, depending on the nature of the template
DNA.

Parallel conclusions can be drawn from the studies of
Hacker and Alberts (1994a, 1994b). Based on their thorough
understanding of the phage T4 replication enzymes,
they skillfully engineered hairpin duplexes into the DNA
templates downstream of the primer that acted as reversible
barriers to the DNA polymerase. In these studies, the
kinetics of the dissociation of the T4 holoenzyme was measured
under two conditions. When the polymerase was
traversing a 32 protein-coated single-stranded DNA template
(again mimicked by omitting one dNTP), the replicating
polymerase had a half-life on the template of about
150 s, enough time to synthesize more than 75,000 nucleotides
(Hacker and Alberts, 1994b). In contrast, when the
holoenzyme encountered a duplex DNA hairpin in the path
of the moving polymerase, the half-life of the bound polymerase
was only 1 s (Hacker and Alberts, 1994a). Thus,
the polymerase was smart enough to recognize the new
environment and respond appropriately. One would presume
that a doltish polymerase would bang headlong into
a duplex brick wall in much the same way as a steam
locomotive would be unable to recognize a dead-end track.
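
The arithmetic behind these two half-lives can be made explicit with a short sketch. It assumes simple first-order (exponential) dissociation, which the text does not state, and takes the elongation rate as the lower bound of 500 nt per second quoted earlier for the polymerase holoenzymes; the function names are illustrative only.

```python
# Sketch only: assumes first-order dissociation and a 500 nt/s elongation rate
# (the "more than 500 nt per second" lower bound quoted in the text).

RATE_NT_PER_S = 500.0        # lower-bound elongation rate from the text
HALF_LIFE_SSDNA_S = 150.0    # half-life on 32 protein-coated single-stranded template
HALF_LIFE_HAIRPIN_S = 1.0    # half-life after encountering a duplex hairpin


def nucleotides_in(time_s: float, rate_nt_per_s: float = RATE_NT_PER_S) -> float:
    """Nucleotides synthesized in time_s seconds at a constant rate."""
    return time_s * rate_nt_per_s


def fraction_bound(time_s: float, half_life_s: float) -> float:
    """Fraction of polymerase still bound after time_s (assumed exponential decay)."""
    return 0.5 ** (time_s / half_life_s)


if __name__ == "__main__":
    # One half-life on single-stranded template at 500 nt/s.
    print(nucleotides_in(HALF_LIFE_SSDNA_S))            # 75000.0
    # Fraction still bound 10 s after the start in each situation.
    print(fraction_bound(10.0, HALF_LIFE_SSDNA_S))
    print(fraction_bound(10.0, HALF_LIFE_HAIRPIN_S))
```

Under these assumptions, one half-life on single-stranded template corresponds to 150 s x 500 nt/s = 75,000 nucleotides, matching the figure in the text, whereas a polymerase stalled at a hairpin is almost entirely released within ten seconds.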

The E. coli pol III holoenzyme replicated the
single-stranded template DNA right up to the 5' end of the
downstream duplex DNA located in its path, thereby providing
a substrate for the DNA ligase to join the newly synthesized
strand to the DNA strand that blocked progression of the
polymerase. Normally, during replication of the
lagging-strand DNA template, an RNA primer is removed either
by an RNase H activity or by the 5' to 3' exonuclease activity
of E. coli DNA pol I. If the pol III holoenzyme can load
onto an RNA primer and replicate the lagging-strand template
all the way to the 5' end of the previously synthesized
Okazaki fragment, then only one DNA polymerase molecule
might be necessary for lagging-strand DNA replication
(Figure 2A). This raises the question of how removal
of the RNA primer is coordinated with replication and the
role, if any, for DNA pol I and its 5' to 3' exonuclease activity
in lagging-strand DNA replication. In the case of phage
T4, the phage-encoded RNase H removes the primer
RNA, and one suspects that the phage-encoded polymerase
is the only polymerase required to replicate T4 DNA.
The polymerase dissociation mechanism also raises the
question of what determines the length of an Okazaki fragment.
One strong possibility consistent with these results
is that the frequency of initiation on RNA primers that are
laid down by primase is the major determinant of Okazaki
fragment size (Zechner et al., 1992).

The phage and bacterial DNA polymerase holoenzymes
are likely to recognize additional traffic or road blocks. For
example, the T4 holoenzyme can pass a transcribing RNA
polymerase without dissociating itself or the RNA polymerase
from the template (Liu et al., 1993). In another context,
the replication machinery might recognize damaged DNA
(perhaps bound by damage recognition proteins) and stall
or dissociate from the template until the damage is corrected
by the DNA repair apparatus. If dissociation occurs,
one attractive possibility is that the clamp protein that
remains associated with the replicated DNA might itself act
as a signal for the repair machinery.
The Eukaryote Fork: Similarities and Differertces 
An understanding of the protein machinery that functions
at the replication fork during duplication of cellular DNA
in eukaryotes has not been forthcoming owing to the lack
of a suitable biochemical system. Studies on the replication
of simian virus 40 (SV40) DNA, however, have provided
valuable insight into the eukaryotic replication machine
and have revealed many functional similarities and
a few differences with the prokaryote machines.

One caveat to relying on SV40 to provide insight into
the cellular machinery is that the virus-encoded T antigen
performs many functions required for replicating the virus
genome. First, it is an initiator protein that functions to
begin DNA replication at the SV40 origin; second, it is a
DNA helicase that unwinds the DNA ahead of the polymerizing
machinery; third, it acts as a primosome-loading protein
(Stillman, 1994). Nevertheless, SV40 must rely heavily
on the cellular DNA replication apparatus for replication
of its own DNA, and biochemical studies over the last 10
years have led to the identification and characterization
of these proteins (see Table 1).

In eukaryotes, three DNA polymerases (α, δ, and ε) have
been identified with catalytic subunits having primary
amino acid sequence similarity to each other and to the
T4 phage 43 protein (Kornberg and Baker, 1992). Unlike
its prokaryote cousins, the eukaryotic DNA primase forms
a permanent complex with a DNA polymerase (α). Contrary
to what was thought for many years, the role of the
polymerase α-primase complex appears to be solely for
the purpose of providing RNA-DNA primers for initiation of
leading-strand synthesis and for initiation of each Okazaki
fragment synthesis during lagging-strand replication
(Figure 2B) (Nethanel et al., 1988; Tsurimoto et al., 1990;
Waga and Stillman, 1994).

PCNA and replication factor C (RFC) are essential subunits
of DNA polymerase δ and, under certain conditions,
also stimulate the activity of DNA polymerase ε (Lee et
al., 1991; Podust and Hubscher, 1993; Waga and Stillman,
1994). In almost all respects, PCNA and RFC function in
a very similar way to the β or 45 sliding clamp proteins
and the γ complex or 44/62 brace proteins, respectively
(Figure 1B) (Tsurimoto et al., 1990; Lee et al., 1991). There
is a clear role for both proteins in the replication of the
leading and lagging strands during SV40 DNA synthesis in
vitro. For SV40 DNA replication in vitro, DNA polymerase δ
is responsible for replication of the leading strand and for
completion of the lagging strand (see Figure 2B; Waga
and Stillman, 1994). Herein lies a significant difference
between the prokaryote and eukaryote DNA replication
mechanisms (Figure 2). It appears that a dimer of one
polymerase replicates both strands at the prokaryotic fork,
including the entire Okazaki fragment. In contrast, for
SV40 DNA replication, there is a DNA polymerase switch
from α to δ during initiation at the replication origin and
for synthesis of each Okazaki fragment. Thus, polymerase
δ seems to play a role like the pol III protein, and it
may well be that polymerase δ also cycles between PCNA
clamps on the lagging strand, [...] duplex DNA
downstream of the polymerase.

Maturation of the Okazaki fragments in eukaryotes requires
a 5' to 3' exonuclease (FEN-1, MF-1) and RNase
HI to remove the RNA from the 5' end of the Okazaki
fragments and DNA ligase I to covalently join the DNA
(Turchi et al., 1994; Waga and Stillman, 1994). If PCNA
remains at the junction between adjacent Okazaki fragments
like the β clamp of E. coli, then it is possible that
PCNA may facilitate the activity of one or more of these
maturation proteins.

DNA polymerase ε is essential for the viability of the
yeast Saccharomyces cerevisiae, but a biochemical role
for this polymerase has not been demonstrated in SV40
DNA replication in vitro or for cellular DNA replication. It
has been suggested that polymerase ε either functions at
the cellular DNA replication fork or, alternatively, plays
a role in replication-linked DNA repair pathways that are
essential for cell viability. Further work is needed to resolve
the function of this enzyme.

Control of Fork Movement

A mechanism that alters the stability of the DNA polymerase
holoenzyme on the DNA template could be useful for
coordinating DNA replication with other cellular processes.
I have already suggested that the holoenzyme
may encounter DNA damage or damage-binding proteins
and temporarily arrest while DNA repair occurs. Recent
studies have demonstrated direct control of the DNA
replication function of PCNA by the DNA damage-inducible,
p53-regulated, cyclin-dependent kinase inhibitor p21
(Waga et al., 1994). Perhaps the p21 protein, by binding
to PCNA directly, tricks the polymerase δ-RFC-PCNA
holoenzyme into a state that causes rapid dissociation of the
elongating polymerase, thereby arresting DNA replication.
If PCNA remained bound to the template, it might facilitate
DNA repair at that site. Consistent with this hypothesis,
it has been shown that PCNA (and, I suspect, RFC)
participates in nucleotide excision repair of ultraviolet-damaged
DNA. Thus, the sliding clamp may be a key link between
DNA replication and repair.

Regulators such as p21 might be one mechanism to
couple DNA metabolism to cell cycle controls. I suspect,
however, that we are only at the beginning of a real
understanding of the intricacies of these machines and how they
are regulated.
Summary

The crystal structure of the β subunit (processivity factor)
of DNA polymerase III holoenzyme has been determined
at 2.5 Å resolution. A dimer of the β subunit
(Mr = 2 × 40.6 kd, 2 × 366 amino acid residues) forms
a ring-shaped structure lined by 12 α helices that can
encircle duplex DNA. The structure is highly symmetrical,
with each monomer containing three domains of
identical topology. The charge distribution and orientation
of the helices indicate that the molecule functions
by forming a tight clamp that can slide on DNA,
as shown biochemically. A potential structural relationship
is suggested between the β subunit and proliferating
cell nuclear antigen (PCNA, the eukaryotic
polymerase δ [and ε] processivity factor), and the gene
45 protein of the bacteriophage T4 DNA polymerase.

Introduction 

DNA polymerases are enzymes that duplicate the information
content of DNA by catalyzing the template-directed
polymerization of nucleic acids. A distinction can be made
between polymerases that are primarily involved in the
replication of chromosomal DNA during cell division and
those that normally operate on shorter stretches of template
during, for example, the repair of damaged DNA.
Polymerases in the latter class are generally nonprocessive,
i.e., they polymerize only a few nucleotides before
dissociating from the template (Kornberg and Baker,
1991). In contrast, the chromosomal replicative polymerase
of Escherichia coli, DNA polymerase III (PolIII) holoenzyme,
is distinguished by its ability to perform rapid replication
(750 bases per second) of very long stretches of DNA
without dissociation (Fay et al., 1981; Burgers and Kornberg,
1982; O'Donnell and Kornberg, 1985; Kornberg and
Baker, 1991). This property is conferred upon the enzyme
by the presence of associated proteins that clamp the
polymerase onto primed DNA, in a process that expends ATP
energy. This mechanism, first worked out in detail for E.
coli PolIII holoenzyme and the bacteriophage T4 DNA
polymerase system, appears to operate analogously in
the eukaryotic DNA polymerases δ and ε (Kornberg and
Baker, 1991). The three-dimensional structure of one
polymerase has been determined by X-ray crystallography,
that of the Klenow fragment of E. coli PolI (Ollis et al.,
1985). Although this is not a highly processive polymerase,
the structure has general relevance for understanding the
mechanism of the enzymatic subunits of DNA polymerases.
No structural information has yet been available,
however, for any of the accessory proteins of the
processive polymerases.

Intact PolIII holoenzyme is a complex of at least 10 different
protein subunits (α, ε, θ, τ, γ, δ, δ', χ, ψ, and β) (Maki
and Kornberg, 1988). The α subunit performs the catalytic
polymerase function, and the ε subunit is the 3'-5'
exonuclease. A three-subunit core polymerase subassembly
of the holoenzyme, containing α, ε, and θ, is unable to
act processively on its own, although it can fill in short
single-stranded regions. The highly processive character
of the holoenzyme can be reconstituted upon mixing the
core polymerase with both the β subunit and the five-protein
γ complex (γ, δ, δ', χ, and ψ) (Wickner, 1976;
O'Donnell, 1987; Maki and Kornberg, 1988). The reconstitution
of the processive polymerase proceeds in two distinct
stages. In the first stage, the γ complex hydrolyzes
ATP to transfer the β subunit to the primed template. In
the second stage, the core polymerase assembles with
the β subunit on DNA to form the processive polymerase.
Thus, it is the β subunit that confers the remarkable
processivity onto the core polymerase. Study of the minimal
number of subunits required to assemble the processive
polymerase showed that only the γ and δ subunits of the
γ complex are needed to transfer β from solution to the
primed template (O'Donnell and Studwell, 1990). Two subunits
of the core polymerase, α and ε (as an αε complex),
are needed for processive polymerization (Studwell and
O'Donnell, 1990). Once on DNA, the β subunit confers
complete processivity onto the αε polymerase, even upon
subsequent removal of the γ complex (Stukenberg et al.,
1991).

Once the γ complex has performed the operation of
clamping the β subunit onto DNA, the β subunit is very
strongly bound in that it cannot be easily separated from
circular DNA. It has, however, been shown to slide freely
along duplex DNA, consistent with its role as a clamp that
tethers the polymerase core to the template and moves
along with the polymerase during replication (Stukenberg
et al., 1991). Experiments using restriction endonucleases
have revealed that if circular DNA is cut after the β subunit
is clamped on, the β subunit completely separates from
DNA by sliding to the site of the break and falling off
(Stukenberg et al., 1991). These and related experiments in
the same study suggest that, in contrast to site-specific
DNA-binding proteins such as transcription factors or
nucleases, which make specific hydrogen-bonding or
other stabilizing interactions with DNA, the β subunit is
bound to DNA mainly by virtue of its topology rather than by
stabilizing interactions. It was proposed that the β subunit
might form a closed ring, and that one role of the γ complex
might be to open and then close the ring around DNA,
Figure 1. Ribbon Representation of the Polypeptide
Chain of a β Subunit Dimer, Looking
Down the 2-Fold Axis of the Ring
The α helices are shown as spirals and the β
sheets as flat ribbons. The two monomers are
colored yellow and red. A standard model of
B-form DNA (Saenger, 1984) is in the middle
of the structure, represented in stick form with
phosphorus, oxygen, nitrogen, and carbon
atoms colored yellow, red, blue, and green,
respectively. The DNA structure is hypothetical,
and is placed in the geometric center of the β
subunit ring with the helix axis aligned along
the 2-fold rotation axis of the ring. Figures 1, 2,
3, 6, and 8 were generated using the program
QUANTA (Polygen Corp.). Refined atomic
coordinates and X-ray structure factors are being
deposited in the Protein Databank.
effectively trapping the β subunit on DNA (Stukenberg et
al., 1991).

As a step toward a detailed understanding of the molecular
basis of the processive properties of PolIII holoenzyme,
we have crystallized the β subunit and determined its
three-dimensional structure by X-ray diffraction at 2.5 Å
resolution. In pleasing congruence with previous speculation,
we find that the structure is indeed that of a closed
ring, with overall shape similar to that of a donut or toroid.
In this paper we discuss the consequences of this structure
for the function of the β subunit, and suggest a possible
structural relationship between this protein and its functional
equivalent in the eukaryotic polymerase δ (and ε)
replicases, the processivity factor PCNA (proliferating cell
nuclear antigen), and the bacteriophage T4 DNA polymerase
gene 45 protein. This represents the first atomic resolution
view of any of the accessory proteins of processive
polymerases involved in DNA replication, and complements
the previously reported structure of the polymerase
I Klenow fragment (Ollis et al., 1985).

Results and Discussion 

Architecture of the β Subunit Dimer
The β subunit forms a head-to-tail dimer in the crystal,
consistent with previous observations that the isolated protein
is a dimer in solution (Johanson and McHenry, 1980).
Representations of the polypeptide backbone of the dimer
are shown in Figures 1 and 2, and a space-filling representation
of all atoms is in Figure 3. The overall structure is
that of a star-shaped ring, of approximate diameter 80 Å,
with a hole of diameter ~35 Å in the middle (Figure 1). The
2-fold dimer axis is perpendicular to the face of the ring,
the thickness of which is about that of one full turn of duplex
B-form DNA (~34 Å).

The starlike shape of the ring is due to an unexpected
feature of the structure: it is internally symmetric, with each
monomer of the β subunit consisting of three structural
domains of identical chain topology and very similar
three-dimensional structure. Each domain is roughly 2-fold
symmetric in its architecture, with an outer layer of two β sheets
providing a scaffold that supports two α helices. Replication
of this motif around a circle results in a rigid molecule
with 12 α helices lining the inner surface of the ring, and
with 6 β sheets forming the outer surface (Figure 1).
Surprisingly, this structure is reminiscent of the ring-shaped
pentameric assembly of the B subunits of cholera toxin-related
heat-labile enterotoxin from E. coli (Sixma et al.,
1991). The toxin structure has 5 α helices that line the
inner surface of the ring with antiparallel β sheets forming
the outer surface. The hole in the ring is plugged by an
extended polypeptide strand from the A subunit. The hole



Figure 3. Space-Filling Model of the β Subunit Dimer with B-Form
DNA

One monomer is colored red and the other yellow. The radius of the
spheres corresponds to the van der Waals radius of the corresponding
atom. Hydrogen atoms are not explicitly displayed, but manifest
themselves as increased radii for atoms that they are bonded to. The
hypothetical model of B-form DNA is as in Figures 1 and 6, and is shown
with one strand colored white and the other green. The double helix
passes through the hole in the β subunit dimer with no steric repulsions.



in the toxin structure (diameter 11 Å) is much smaller than
that observed in the β subunit dimer, and the detailed
topology of each of the B subunits of the toxin is different
from that of the β subunit modules.

The secondary structure and chain topology of a domain
are particularly simple and are shared by all three domains
in one monomer: two adjacent antiparallel α helices are
flanked by two 4-stranded antiparallel β sheets (see schematic
diagram in Figure 4A). One of the β sheets forms the
commonly found "greek key" motif (Branden and Tooze,
1991). The other β sheet contains the N- and C-termini of
the chain. If an imaginary connection is drawn between
the termini, this β sheet also forms a greek key, and a
striking 2-fold symmetry in the chain topology is revealed
(Figure 4A). Although there is insufficient sequence similarity
to draw conclusions about the evolution of this fold,
we note that one domain can be generated by duplication
of a βαβββ motif (Figure 4A). The chain topology diagram
also reveals the simple principle underlying the architecture
of the entire ring. The two outer strands of β sheets
in one domain form hydrogen-bonding interactions with
corresponding strands in two adjacent domains, continued
around the circle (Figure 4C). No distinctions are apparent
between such β sheet extensions across internal
domain boundaries as opposed to intermolecular contacts,
i.e., the two dimer interfaces also form continuous
antiparallel β sheets. These interactions lead to a completely
closed circle with six "seamless" β sheets on the
outer surface (Figures 1, 3, and 4).

Figure 2. Cα Connectivity of the β Subunit Dimer

The stereo diagrams are colored based on the sequence number of the
residues, with the colors smoothly varying in the order green, light blue,
[...] yellow with increasing sequence number within each monomer, from
the N-terminus to the C-terminus. The dimer interfaces are [...]. The
chain termini are marked with N and C. The domains are numbered 1, 2, 3
and 1', 2', 3' in the two monomers. (B) Edge view of the ring.

Figure 4. Secondary and Tertiary Structure of the β Subunit

The secondary structure elements were defined using the program
DSSP (Kabsch and Sander, 1983).
(A) Schematic diagram of the secondary structure of one domain of
the β subunit. α helices are shown as rectangles and β sheets as [...].
The topology is 2-fold symmetric about the [...] the ellipse. The
structure can be generated by duplication of a βαβββ motif.
(B) [...] of a domain. The approximate [...] axis is indicated by the
arrow. The secondary structure [...].
(C) [...] boundaries are shaded gray. These [...] β sheets in the dimer.
The other two are the corresponding ones [...] and two others are formed
by [...] sheets (shown unshaded) across the molecular boundaries. Note
the [...] protruding loops on the top half of the molecule. The
secondary structure elements are labeled as in (A), with the unprimed,
singly primed, and doubly primed labels referring to the first, second,
and third domain, respectively.

Each domain consists of about 110 residues, and forms
a compact and well-folded structure (Figure 4B). The 2-fold
symmetry apparent in the topology diagram manifests itself
as a very approximate 2-fold axis between the two
helices (Figure 4B). Despite the simple architecture, each
domain is quite clearly an independent folding unit, with
a well-defined hydrophobic core consisting of about 20
residues. The symmetry axis relating the two molecules in
the dimer is noncrystallographic, i.e., the two monomers
are packed into different crystal environments and are
crystallographically independent. The transformation that
optimally superimposes the two molecules is a 180° rotation
about an axis perpendicular to the plane of the ring,
and results in a root mean square (rms) deviation of 0.40
Å between equivalent Cα positions (not including residues
in the loops formed by residues 1^28 and 210-213, and
the last three residues at the C-terminus). This axis is
inclined by ^2<> with respect to the b axis of the P2₁ crystal
form, and therefore the holes of translationally related dimers
do not line up to generate a linear tunnel in the crystal.
Rotations about the dimer axis also superimpose the
three different domains that constitute each monomer,
and this basic structural unit repeats after every 60° rotation
about the axis. Although the amino acid sequences of
each domain are quite different, superimposition of the Cα
positions of the three domains reveals that approximately
80% of these positions can be considered to be structurally
analogous (Figure 5). The rms deviation in the positions
of these Cα atoms is 1.2 Å.
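
As a rough illustration of the two quantities used in this paragraph, the following sketch computes an rms deviation between paired coordinates and checks that three successive 60° rotations about the ring axis are equivalent to the 180° dimer operation. It assumes an idealized, perfectly symmetric ring; the function names, the toy coordinates, and the 30 Å radius are hypothetical and are not taken from the refined structure.

```python
import cmath
import math


def rmsd(coords_a, coords_b):
    """Root mean square deviation between two equal-length lists of (x, y, z)."""
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and equal in length")
    total = sum(math.dist(a, b) ** 2 for a, b in zip(coords_a, coords_b))
    return math.sqrt(total / len(coords_a))


def rotate_in_ring_plane(point, angle_deg):
    """Rotate a point in the ring plane (a complex number) about the ring axis."""
    return point * cmath.exp(1j * math.radians(angle_deg))


# Hypothetical center of domain 1, 30 A from the ring axis (illustrative only).
domain_1 = complex(30.0, 0.0)

# Idealized positions of the six domains, one every 60 degrees.
domain_centers = [rotate_in_ring_plane(domain_1, 60.0 * k) for k in range(6)]

# Three 60-degree steps carry domain 1 onto domain 1' of the other monomer,
# which is the same as applying the 180-degree dimer rotation once.
after_three_steps = domain_centers[3]
after_dimer_rotation = rotate_in_ring_plane(domain_1, 180.0)
```

In a real comparison the two coordinate lists passed to `rmsd` would first be optimally superimposed; the sketch deliberately leaves that fitting step out.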

Potential Mode of Interaction with DNA 
In the replication of a chromosome, the initial clamping of
a β subunit dimer on DNA occurs at a primer terminus
which is an RNA-DNA hybrid duplex, presumably A-form
(Saenger, 1984; Kornberg and Baker, 1991). Subsequent
to this event, the β subunit can freely move along duplex
DNA, presumably B-form (Stukenberg et al., 1991). The
interactions of the β subunit with both the A and B forms
of the double helix are therefore of interest. A-form DNA
is similar to the B form in terms of the width of its cross
section, and the local direction of the phosphate backbone
with respect to the helix axis is very similar in both forms
(Saenger, 1984). All subsequent analysis is focused on
just these two gross features of the double helix, and the
results are applicable to both the A and B forms.

Although the structure has been determined in the absence
of DNA, several obvious features indicate that the
protein is designed to wrap around the double helix with
a minimum of locally specific interactions. The high symmetry
of the structure is well suited to interact with the
cylindrically symmetric DNA duplex, and the hole in the
middle of the ring (of diameter ~25 Å, not including extended
sidechains) is large enough to easily accommodate
either the A or B forms of DNA (diameter ~25 Å) with
no steric repulsion. Insertion of a model of either form of
DNA into the ring results in a precise relationship between
the common tilt in the orientations of all 12 α helices and
the tilt of the phosphate backbone (see below). Finally,
Figure 5. Structural Similarity between Three Domains of the Monomer

[...] of each monomer is displayed in a stereo diagram after the three are
optimally superimposed. The domains were considered in pairs, and a
[...] repeating the least-squares optimization. The [...] the helix axis.
This does not significantly alter its orientation with respect to the
proposed model of DNA.



although the protein is strongly negatively charged, calculation
of the electrostatic field generated by the molecule
reveals a focusing of positive electrostatic field in the center
of the ring, precisely where the negatively charged
phosphate backbone of DNA is expected to be.

One consequence of the symmetrical arrangement of
the six domains is that each of the 12 α helices has a
similar tilt with respect to the axis of the ring. Assuming
that duplex DNA passes through the middle of the ring, we
generated a model of standard B-form DNA and placed it
in the center of the β subunit (Figures 1, 3, and 6). The
simple assumption that the duplex is perpendicular to the
plane of the ring results in an intriguing relationship between
the axes of all 12 helices and the phosphate backbone
lining the major and minor grooves of DNA (Figure
6). The axis of each helix is almost precisely perpendicular
to the local direction of the phosphate backbone, i.e., the
helices span the major and minor grooves. This feature
seems designed to prevent entry of the protein into either
groove, and should facilitate rapid motion along the duplex.
An additional consequence of aligning the DNA perpendicular
to the ring is that each helix interacts with a
different phase of DNA: if one helix spans the major
groove, the one directly across the ring spans the minor
groove. This is likely to lead to a damping out of the variation
in interaction energy with the phosphate backbone as
the protein moves across the grooves of DNA. The α helices
are maintained in this precise orientation due to packing
interactions with each other and with the underlying β
sheet. Despite the intrinsic curvature of the β sheets, the
strands, by and large, run in directions parallel to the helices,
and therefore perpendicular to the DNA backbone in
this model (Figures 2 and 4).
When viewed down the double helix, the phosphate
backbone of DNA is 10-fold and 11-fold symmetric in projection
for the A and B forms of DNA, respectively (such
a projection is shown for the B form in Figure 1). Thus,
there is no specific correspondence between the number
of helices (12) that line the hole of β subunit and repetitive
periods in DNA (10 or 11). Rather, the important features
of the structure are the circular symmetry and the conservation
of the helix-DNA backbone interaction. Given the
dimensions of the ring and the tilt of the helices, the 12
helices pack against each other with no additional space
remaining.

A dimer of β subunit contains 38 aspartate, 58 glutamate,
24 lysine, 50 arginine, and 14 histidine residues.
The protein thus has a net charge of -22 if all histidines
are assumed neutral, and -15 if half of them are charged.
This negative charge is consistent with the inability of the
β subunit to bind DNA without ATP activation by the γ
complex. The electrostatic charge is not, however, uniformly
distributed over the protein. We have found it useful
to visualize the effects of the asymmetric charge distribution
by calculating the electrostatic field generated by the
protein, using a continuum electrostatic model that treats
the protein as a low dielectric medium with embedded
charges, immersed in a high dielectric solvent (water) of
variable ionic strength (Gilson et al., 1988). While a quantitative
analysis of charge effects would require a careful
consideration of approximations introduced in this model,
we focus solely on qualitative features that are reproducibly
obtained with various calculational parameters (see
legend to Figure 7).
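
The net-charge arithmetic quoted above can be reproduced with a short sketch. It assumes formal charges only (+1 for Lys and Arg, -1 for Asp and Glu, and a chosen fraction of His carrying +1) and ignores the chain termini; the function name `net_charge` and its parameter are illustrative, not taken from the paper.

```python
# Residue counts for the dimer, as quoted in the text.
RESIDUE_COUNTS = {"Asp": 38, "Glu": 58, "Lys": 24, "Arg": 50, "His": 14}


def net_charge(counts, his_fraction_charged=0.0):
    """Formal net charge from residue counts.

    his_fraction_charged is the assumed fraction of histidines carrying +1.
    """
    negative = counts["Asp"] + counts["Glu"]
    positive = counts["Lys"] + counts["Arg"]
    return positive - negative + his_fraction_charged * counts["His"]


all_his_neutral = net_charge(RESIDUE_COUNTS)          # 74 - 96
half_his_charged = net_charge(RESIDUE_COUNTS, 0.5)    # 74 - 96 + 7
```

With these counts the two cases evaluate to -22 and -15, matching the values in the text.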

Two important qualitative features consistently emerge
from analysis of the computed electrostatic field. The outer
edge and both faces of the ring are in regions of strongly
negative electrostatic potential (Figures 7A and 7B). However,
the surface of the hole has strongly positive electrostatic
potential, and would be expected to interact favorably
with the negatively charged backbone of DNA. Thus,
the β subunit dimer can be described as a ring of negative
charge surrounding a positively charged core (Figure 7).
This focusing of positive charge is probably necessary to
stabilize the dimer around DNA, as the dimer interface is
itself mainly electrostatic in nature (see discussion below)
and is unlikely to withstand repulsive interactions with
DNA.

Figure 6. The Helices of the β Subunit Are Perpendicular to the
Phosphate Backbone of the DNA

The 12 α helices of the β subunit dimer and a hypothetical model of
B-form DNA are shown. The relative orientation of the double helix and
the β subunit is exactly as in Figures 1 and 3. For clarity, the α helices
are shown schematically in a ribbon representation, with the brighter
ones being closer to the viewer. The central helix in front of the DNA
is perpendicular to the phosphate backbone and spans the major
groove of the double helix, while the one furthest behind it spans
the minor groove. Each of the 12 helices has similar disposition with
respect to the DNA backbone, but faces a different combination of
the major and minor grooves. In this projection, the major and minor
grooves are superimposed, with the major groove in the center of the
figure being nearer the viewer.

The other important feature is that the two faces of the
ring are quite dissimilar in their properties, due to the
head-to-tail dimer formation (Figure 7B). Although both are
negatively charged, the negative electrostatic potential is
clearly more extended on one face than the other. The
other face has six prominent loops that extend away from
the ring and appear to be well suited for interactions with
another protein subunit (Figure 4C). We presume that only
one of these faces interacts with the αε polymerase, which
is known to bind strongly to the β subunit (Stukenberg et
al., 1991). Although the faces of the β subunit are asymmetric,
the symmetry of duplex DNA makes it unlikely that
one orientation will be preferred over the other. It is likely
that the γ complex interacts with the primer template junction
in a specific orientation, and may thereby correctly
orient the β subunit with respect to the primer terminus for
productive interaction with the αε polymerase.



Figure 7. Electrostatic Potential Maps for the
β Subunit Dimer and an Isolated Monomer
The maps are calculated using the programs
DELPHI (Gilson et al., 1988) and INSIGHT II
(Biosym Technologies). Lys and Arg residues
have a single positive charge localized on the
terminal nitrogen atoms of the sidechains. Asp
and Glu residues have a single negative
charge, localized on the terminal oxygen atoms
of the sidechains. His sidechains have a 1/2
positive charge each. All other atoms in the
molecule are considered neutral. Qualitatively
similar results are obtained upon changing the
charges on the protein to make all histidines
neutral. The calculation was done assuming a
uniform dielectric of 80 for the solvent and 2 for
the protein interior. The ionic strength was set
to zero. The red and blue mesh contours represent
negative electrostatic potential (energy of
-2.5 kBT/e) and positive electrostatic field (energy
of +2.5 kBT/e), respectively. kBT is the
product of the Boltzmann constant and the
temperature, and e is the charge of the electron.
Two orthogonal views of the electrostatic potential
of the dimer are shown in (A) and (B), and
(C) is the potential for an isolated monomer.
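
For orientation, the contour levels quoted in this legend can be converted from units of kBT/e to millivolts. The legend does not state the temperature, so the sketch below assumes 298.15 K; the function name and its default are hypothetical.

```python
BOLTZMANN_J_PER_K = 1.380649e-23        # exact SI value
ELEMENTARY_CHARGE_C = 1.602176634e-19   # exact SI value


def kbt_per_e_to_millivolts(level, temperature_k=298.15):
    """Convert a potential in units of kBT/e to millivolts.

    temperature_k is an assumption; the figure legend does not give it.
    """
    volts = level * BOLTZMANN_J_PER_K * temperature_k / ELEMENTARY_CHARGE_C
    return volts * 1000.0


positive_contour_mv = kbt_per_e_to_millivolts(2.5)
negative_contour_mv = kbt_per_e_to_millivolts(-2.5)
```

Under that temperature assumption the ±2.5 kBT/e contours correspond to roughly ±64 mV.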
Figure 8. Stereo Diagram of the Dimer Interface

Bonds linking Cα atoms in one monomer are yellow, and those in the other are green. Note the continuation of the β sheet across the interface.
Ten amino acid sidechains are shown at the interface, with the dot surfaces representing the van der Waals spheres of the atoms. Four of these
are hydrophobic residues and are colored white (Phe-106 and [...]). [...] are contributed by one monomer and are colored blue. Glu-300, [...]
are from the other monomer and are colored red. For clarity, Arg-96 and Glu-303 are not shown.



The spacing between the hypothetical phosphate backbone
of DNA and the protein sidechains is such that direct
contacts are not likely to be made. The distance of closest
approach between the terminal atoms of fully extended
arginine residues lining the hole and the phosphate groups
is expected to be no less than 3.5 Å. This suggests that
water molecules will play an important role in mediating
the protein-DNA interactions, which will also increase the
ability of the protein to move along DNA without becoming
attracted to any particular region. This lack of specificity
makes it unlikely that useful crystals of a β subunit-DNA
complex would ever be obtained. Molecular dynamics simulations
of a solvated model of β subunit and DNA may be
a useful approach toward understanding the details of this
interaction.

The Dimer Interface and the Formation of the
Head-to-Tail Dimer

The nature of the dimer interface is relevant to an
understanding of the role played by the γ complex and ATP in
assembling the β subunit on duplex DNA. The main feature
of the interface is a continuation, across the molecular
boundary, of β sheet structure (Figures 2, 8, and 10). This
appears indistinguishable from β sheet continuation at
interdomain boundaries within a monomer, and contributes
at least four strong hydrogen bonds at each of the two
interfaces. Relatively little surface area is buried upon
formation of the β subunit dimer. The exposed surface area
calculated using a water-sized probe of radius 1.5 Å (Lee
and Richards, 1971) is 33,218 Å² for two isolated monomers,
and decreases by only 8% (to 30,525 Å²) upon dimerization.
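
The 8% figure follows directly from the two accessible surface areas quoted above. The short Python sketch below reproduces that arithmetic; the function name and the per-interface split (dividing the buried area evenly between the two interfaces of the ring) are illustrative assumptions, not part of the original analysis.

```python
def buried_surface(area_isolated, area_complex):
    """Return (buried area, buried fraction of the isolated-state area)."""
    buried = area_isolated - area_complex
    return buried, buried / area_isolated


# Accessible surface areas quoted in the text, in square angstroms.
AREA_TWO_MONOMERS = 33218.0
AREA_DIMER = 30525.0

buried, fraction = buried_surface(AREA_TWO_MONOMERS, AREA_DIMER)
print(f"buried on dimerization: {buried:.0f} A^2 ({100 * fraction:.1f}%)")
# Assumption: the ring has two equivalent interfaces sharing the buried area.
print(f"per interface: {buried / 2:.1f} A^2")
```

The result rounds to the 8% stated in the text.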

In addition to the hydrogen bonds contributed by the β
sheet, further stabilization at the dimer interface is
provided by two distinct sets of interactions between amino
acid sidechains from neighboring domains (Figure 8). At
the center of the interface, the sidechains of Phe-108 and
Ile-278 from one monomer pack against Ile-272 and Leu-273
from the other and form a small hydrophobic core.
Surrounding these residues are six potential intermolecular
ion pairs: Lys-74 and Glu-298' (closest observed distance
between charged groups of 2.9 Å, with the prime
indicating a residue in the other monomer), Lys-74 and
Glu-300' (2.5 Å), Arg-96 and Glu-300' (4.7 Å), Arg-103 and
Glu-304' (3.0 Å), Arg-105 and Glu-301' (3.3 Å), and Arg-105
and Glu-303' (3.0 Å). Two of the ion pairs (Arg-96-Glu-300'
and Arg-103-Glu-304') involve charged groups that are
both inaccessible to solvent, as determined using a
water-sized probe (Lee and Richards, 1971), which is expected
to lead to particularly strong ionic interactions. A feature
of these interactions is that all the positively charged
residues are contributed by one monomer, with the other one
contributing the negatively charged residues. The computed
electrostatic potential for an isolated monomer reflects
this charge asymmetry, with the N- and C-terminal
parts of the monomer being in regions of positive and
negative electrostatic potential, respectively (Figure 7C).
This extensive electrostatic complementarity is unique to the
dimer interface, and is not observed at the interdomain
boundaries within a monomer.

The dimer interface is thus seen to have a number of
specific and potentially strong interactions, and a plausible
explanation for the ATP requirement for β subunit assembly
on DNA is that energy is required to break the interfacial
hydrogen bonds and buried ion pairs. Nevertheless, the
relatively small interaction surface suggests that a
monomer-dimer equilibrium is possible in solution. In this
regard, it is interesting to contrast the observed head-to-tail
dimer (with the N-terminal domain of one monomer
interacting with the C-terminal domain of the other) with a
Figure 9. Alignment of the Sequences of the Domains of the β Subunit with Human PCNA, Yeast PCNA, and Gene 45 Protein
The three domains of the β subunit are labeled BETA.1, BETA.2, and BETA.3, and [...]. The human [...] and gene 45 sequences have been split into two domains labeled 1 and 2. The secondary structural elements of the β subunit domains are boxed and labeled. The meanings of the shaded bars for the β subunit sequences are different from those for the PCNA and gene 45 sequences. Within the β subunit, the bars indicate all the residues that are completely buried, as judged by accessibility calculations [...] of the β subunit domains. The residues Asp, Glu, Gln, Asn, His, Arg, and Lys have been excluded from the gray bars [...]. [...] the PCNA and gene 45 domains have sequence identity with amino acids in the β subunit domains. The numbers to the right are the last residue numbers in each row.



head-to-head assembly. The latter would have identical
faces on both sides of the ring and is not forbidden from
considerations of chain topology alone. Model building
shows that by rotating one of the monomers, a structure
can be constructed which maintains similar β sheet and α
helix packing at the interface, but in which the N- and
C-terminal domains interact with the corresponding
domains from the other monomer (this would correspond to
rotating one of the monomers by 180° about a vertical axis
in Figure 2B). Such an arrangement would, however, lead
to unfavorable electrostatic interactions, as the N- and
C-terminal regions of the monomers are positively and
negatively charged, respectively (Figure 7C). Electrostatic
complementarity is therefore likely to play an important
role in favoring an initial head-to-tail association that could
then lead to the formation of the complete interface and
the burial of ion pairs.

Sequence Comparison between β Subunit Domains,
PCNA, and Gene 45 Protein

The internal symmetry and domain structure of the β
subunit was not suspected earlier because the three domains
share very little sequence identity. Pairwise sequence
alignments, based on the three-dimensional structure,
result in only 16%, 9%, and 9% amino acid identity between
domains 1 and 2, 2 and 3, and 1 and 3, respectively (Figure
9). This is well below the threshold of 20%-25% identity
required for the prediction of similar three-dimensional
structure (Sander and Schneider, 1991). Likewise,
conventional sequence-matching algorithms fail to detect any
relationship between the sequences of the three domains.
However, knowledge of the three-dimensional structure
makes possible an unambiguous internal sequence alignment,
based simply on structural correspondence, and
this reveals two conserved features (Figure 9). First, there
are 17 positions in each domain that are completely buried
and where only Phe, Leu, Ile, Met, Val, or Ala is present.
These constitute the conserved hydrophobic cores of the
domains. There are another 10 or so buried positions
where charged residues are not tolerated (with the
exception of residues involved in dimer formation; see below).
Second, although the protein is overall negative in charge,
the two α helices in each domain have net positive charge
(the 12 helices have a total of 22 Arg and Lys residues and
8 Glu and Asp residues).
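
The pairwise identities quoted above are computed over aligned positions. As a minimal sketch of that bookkeeping, assuming the two sequences have already been aligned (here the alignment would come from structural superposition) and that gaps are written as "-", percent identity can be computed as below. The function name and the toy sequences are hypothetical and are not taken from Figure 9.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent identity over aligned columns where neither sequence has a gap."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    # Keep only columns in which both sequences contribute a residue.
    pairs = [(a, b) for a, b in zip(aligned_a, aligned_b)
             if a != gap and b != gap]
    if not pairs:
        return 0.0
    matches = sum(1 for a, b in pairs if a == b)
    return 100.0 * matches / len(pairs)


# Toy example (made-up sequences, for illustration only).
print(percent_identity("ACDE-F", "ACQEGF"))
```

Dividing by the number of gap-free columns is one common convention; dividing by the length of the shorter sequence is another, and the two can give different percentages for the same alignment.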

The three domains presumably diverged from an ancestral
module by gene duplication events. That the present
level of sequence identity is so low probably reflects the
fact that the structural constraints on the domain are fairly
loose compared to the geometric requirements at, for
example, an enzyme active site or the recognition site of a
site-specific transcription factor. Even in systems that do
have such geometric requirements, it appears that large
numbers of different sequences can lead to the same
functional fold, as demonstrated by random mutagenesis
experiments on λ repressor (Lim and Sauer, 1991). The
divergence in sequence between the three domains of the
β subunit is not surprising in this context.

Given the lack of significant sequence similarity between
the three domains of the β subunit, it seems unlikely
that conventional sequence-matching schemes will be
successful in recognizing protein sequences that adopt
this three-dimensional fold. We have, instead, focused our
attention on proteins that are known to be functionally
similar to the β subunit, and looked to see if they are likely to
have the same overall architecture. These proteins are the
various "processivity factors" of other replicative
polymerases (Kornberg and Baker, 1991), the best characterized
of which are the bacteriophage T4 DNA polymerase system
(Nossal and Alberts, 1984) and the yeast and mammalian
chromosomal replication polymerase, Pol δ (Tsurimoto
and Stillman, 1990; Burgers, 1991; Lee et al., 1991a,
1991b). For highly processive DNA replication, each of
these polymerases requires a set of ATP-dependent
accessory factors that appear to act to tether the core
polymerase enzyme onto DNA, in a manner analogous to the
γ complex and β subunit of E. coli Pol III holoenzyme. The
accessory factor that has functional correspondence with
the β subunit is the gene 45 protein for the T4 DNA
polymerase system, and PCNA for eukaryotic Pol δ (Kornberg
and Baker, 1991). T4 gene 45 protein and PCNA are
approximately two-thirds the size of the β subunit (T4 gene 45
protein: 227 residues; Saccharomyces cerevisiae PCNA:
258 residues; human PCNA: 261 residues). Given their
smaller size and knowing the three-domain structure of
the β subunit, we reasoned that these proteins may be
composed of two domains each, and thereby function as
trimeric molecules that assemble six domains around DNA
in sets of two per monomer. There is no obvious amino
acid sequence homology between the β subunit and T4
gene 45 protein or any PCNA. However, measurements
of the molecular weight of T4 gene 45 protein and S.
cerevisiae PCNA in solution indicate that they are likely to be
trimeric (Jarvis et al., 1989; Bauer and Burgers, 1988).

The sequences of T4 gene 45 protein and of yeast and
human PCNA were compared with those of the three
domains of the β subunit, using the following criteria.
Insertions and deletions were allowed only between the
secondary structural elements of the β subunit, and the
hydrophobic core and other buried residues of β were
matched with hydrophobic or neutral polar residues in the
other sequences. The yeast and human sequences are
closely related (30% identity) (Almendral et al., 1987;
Bauer and Burgers, 1990), and these conditions led to a
plausible, albeit weak, alignment for both PCNA
sequences with the β subunit (Figure 9). A similar alignment
is obtained for gene 45 protein with the β subunit (Figure
9). This alignment differs from one previously reported
between yeast PCNA and T4 gene 45 protein, which
included several very long insertions and deletions
(Tsurimoto and Stillman, 1990).

The resulting alignment between the β subunit domains
and other sequences (two domains each) is shown in
Figure 9, which also indicates the buried residues in the
β subunit. The alignment preserves in PCNA the
hydrophobic core of the β subunit, and also reveals a suggestive
level of sequence identity between PCNA and the domains
of the β subunit. At 37 positions out of 110, there are at
least two amino acid identities between the domains of the
β subunit and PCNA (Figure 9). Some of these sequence
identities are intriguing. For example, sheet 4 in the third
domain of the β subunit has two buried Glu residues that
are involved in dimer formation. These are conserved in
corresponding positions in the PCNA sequences. Two of
the positively charged residues involved in ion pairing at
the dimer interface (Lys-74 and Arg-96) have corresponding
Lys or Arg residues at the same position or one amino
acid removed in the PCNA sequences. Eight positions in
the domains of the β subunit have at least one buried
aromatic residue. Five of these have corresponding
aromatic residues in one of the PCNA sequences. Finally,
although both PCNA sequences have a net excess of
negatively charged residues, the regions corresponding to the
helices have net positive charge, qualitatively similar to
what is observed in the β subunit. Normalized to 12 α
helices, the yeast sequences have 18 positive and 12
negative residues on the helices (the human sequence has an
additional negative charge). Similar conclusions can be
drawn from the alignment of gene 45 protein with the β
subunit (Figure 9). As in the β subunit and PCNA, although
the overall charge of the gene 45 protein is negative, the
net charge for the α helices is positive (15 positively
charged and 9 negatively charged when normalized to 12
helices). Ion pairs can also be formed in the trimer interface
of the gene 45 protein. The negatively charged Asp-49 and
Asp-51 (which are aligned one residue away from the
buried Glu residues) can be paired with the positively charged
Lys-177 and -184, or alternatively, the positively charged
Arg-127 and -130 can be paired with Asp-69, Glu-71, and
Glu-76.

This alignment suggests that if PCNA and gene 45 form
a ring around DNA, then they must do so as trimers, with
each monomer of the protein contributing two domains
that are roughly similar in architecture to the β subunit
domains. However, it must be stressed that the alignment
is weak and should only be taken as a hypothesis for
further experimental testing. For example, the recently
developed profile method of sequence comparison (Bowie et al.,
1991), which relies on knowledge of the three-dimensional
structure of one of the structures being compared, is
unable to detect any significant similarity between β subunit
domains and PCNA or gene 45 protein when provided with
the β subunit structure. It is, however, able to detect some,
but not all, of the internal symmetry of the β subunit. A
conclusive understanding of PCNA and gene 45 protein
architecture must await determination of the
three-dimensional structures.

Conclusion 

A satisfying feature of the structure of the β subunit is the
beautiful way in which the circular symmetry of the double
helix has been reflected in the ring-shaped and highly
symmetric structure of the protein clamp. This symmetry
manifests itself on a smaller scale than the domain boundaries,
and the entire structure can be considered to be generated
by replication of a simple βαβββ motif 12 times around a
circle, with the β strands and α helices being oriented
perpendicular to the direction of the grooves of DNA. This
simple and elegant structure carries out a simple function
and stands in sharp contrast to the structure of the actual
catalytic subunit of the polymerase, as probably
exemplified by the Klenow fragment of Pol I. That enzyme, which
must recognize the information content of DNA and then
replicate one strand, has none of the symmetry of DNA
and resembles rather the palm of a hand gripping the
double helix (Ollis et al., 1985). For processive polymerases
involved in DNA replication, it appears that the proper
combination of symmetry and asymmetry is a requirement for
functionality.



Experimental Procedures 
Crystallization

The β subunit was purified to >99% purity as described (Onrust et
al., 1991), with the following modifications. Chromatography on
AH-Sepharose was performed in place of SP-Sephadex, and the
chromatofocusing step was replaced by chromatography on a fast flow Q
column and a Mono Q column (Pharmacia). The purified β subunit was
concentrated to 16 mg/ml, as determined by Bradford assay (Bradford,
1976), by ultrafiltration in a buffer containing 20% glycerol, 20 mM Tris
buffer (pH 7.5), and 0.5 mM EDTA. The protein solution in glycerol was
used as such for all further crystallization trials, using a sparse matrix
method (Jancarik and Kim, 1991). Crystallization was rapid, with many
different conditions yielding crystals within a day or two of the first
experiments. Optimization of several of these conditions led to three
different crystal forms (Forms I, II, and III) that are suitable for
high-resolution X-ray diffraction analysis.

All crystals were grown using the hanging drop method (McPherson,
1990). The drops were initially set up by mixing 2 µl of protein and 2
µl of a "reservoir" solution, and equilibrated against the reservoir at
room temperature or 4°C. The unit cell constants and space groups
of the various crystal forms were determined by a combination of
oscillation data collection (see below), single counter diffractometry (using
a Rigaku AFC5 diffractometer), and precession photography. Form I
crystals are in space group P1 (a = 86.8 Å, b = 73.9 Å, c = 65.7 Å,
α = 75.2°, β = 86.8°, γ = 81.5°) and are grown at 4°C from reservoirs
containing 13%-15% isopropanol, 100 mM CaCl2, and 100 mM MES
buffer (pH 7.5). Form II crystals are obtained at room temperature
(21°C) from reservoirs containing 30% polyethylene glycol (average
M = 400), 100 mM CaCl2, and 100 mM MES (pH 6.5). The space group
is P1 (a = 41.7 Å, b = 72.9 Å, c = 65.5 Å, α = 74.6°, β = 85.1°, γ = 82.2°).
Forms I and II are related, with the asymmetric unit in Form I being a
doubling of that in Form II. Finally, Form III crystals are obtained at
room temperature (21°C) from conditions very similar to those that
yield Form II crystals, except that the pH is lowered to 6.0. This crystal
form is in space group P2₁ (a = 80.6 Å, b = 68.3 Å, c = 82.3 Å,
β = 114.2°).

All three crystal forms grow to large sizes (0.3 × 0.4 × 0.7 mm³)
and diffract strongly to 2 Å resolution. The presence of the rotational
symmetry axis in Form III crystals makes them much more suitable for
oscillation data collection than the lower-symmetry P1 forms,
particularly for the simultaneous measurement of Bijvoet pairs, and thus the
structure determination was carried out using this form alone. Based
on molecular volume calculations, Forms II and III contain a dimer of
the β subunit in the asymmetric unit, while Form I contains two dimers.
Assuming this stoichiometry, the volume per unit mass, V_M, as defined
by Matthews, is 2.5, 2.3, and 2.6 Å³/Da for the three forms, respectively,
within the range typical for protein crystals (Matthews, 1968).
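
The volume per unit mass can be checked from the unit cell constants. The Python sketch below uses the standard triclinic cell volume formula and divides by the protein mass in the cell. Several inputs are assumptions rather than values from the text: a monomer mass of about 40.6 kDa, one asymmetric unit per cell for P1 and two for P2₁, and the Form I cell constants as read from a partly garbled source (a = 86.8 Å, γ = 81.5°).

```python
import math


def cell_volume(a, b, c, alpha, beta, gamma):
    """Volume (A^3) of a general unit cell; edge lengths in A, angles in degrees."""
    ca, cb, cg = (math.cos(math.radians(t)) for t in (alpha, beta, gamma))
    return a * b * c * math.sqrt(1 - ca**2 - cb**2 - cg**2 + 2 * ca * cb * cg)


def matthews_coefficient(volume, asym_units_per_cell, mass_per_asym_unit):
    """Matthews volume per unit mass, in A^3 per dalton."""
    return volume / (asym_units_per_cell * mass_per_asym_unit)


# ASSUMPTION: approximate mass of one subunit monomer; not stated in the text.
MONOMER_MASS_DA = 40_600.0

# (cell constants, asymmetric units per cell, monomers per asymmetric unit)
FORMS = {
    "I": ((86.8, 73.9, 65.7, 75.2, 86.8, 81.5), 1, 4),    # P1, two dimers
    "II": ((41.7, 72.9, 65.5, 74.6, 85.1, 82.2), 1, 2),   # P1, one dimer
    "III": ((80.6, 68.3, 82.3, 90.0, 114.2, 90.0), 2, 2), # P2(1), one dimer
}


def matthews_for_forms(monomer_mass=MONOMER_MASS_DA):
    """Return {form name: V_M} for the three crystal forms."""
    return {
        name: matthews_coefficient(cell_volume(*cell), z, n * monomer_mass)
        for name, (cell, z, n) in FORMS.items()
    }


for name, vm in matthews_for_forms().items():
    print(f"Form {name}: V_M = {vm:.2f} A^3/Da")
```

With these assumptions the computed values land within about 0.1 Å³/Da of the quoted 2.5, 2.3, and 2.6; the exact figures depend on the monomer mass used.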

Data Collection and Structure Determination 
X-ray intensity data collection was carried out by the oscillation method
(Arndt and Wonacott, 1977), using a Rigaku R-AXIS IIC imaging
phosphor area detector, mounted on a Rigaku RU200 rotating anode X-ray
generator (Molecular Structure Corp., Houston). Typical
crystal-to-detector distances and exposure times were 153 mm and 15 min,
respectively, for 2° oscillations. Data processing and reduction were
done entirely by software provided by Rigaku (Table 1).

The structure determination was carried out by the multiple
isomorphous replacement (MIR) method. The amino acid sequence of the β
subunit shows that the molecule contains four cysteines (Ohmori et
al., 1984), at least one of which is reactive to N-ethylmaleimide
(Johanson et al., 1986). The search for isomorphous heavy atom derivatives
was therefore focused on mercury compounds, and two good
derivatives were obtained using mercuric chloride and ethylmercury
phosphate. The binding sites for mercury are similar in both cases, with
mercuric chloride and ethylmercury phosphate reacting with three and
four cysteines per monomer, respectively (Table 2). Although crystals
of the native protein have reasonable lifetimes in the X-ray beam,
allowing complete data sets to be collected on single crystals, the
derivative crystals exhibit much earlier radiation decay. For the
ethylmercury phosphate derivative, this was overcome by measuring data
at -5°C. The mercuric chloride-treated crystals were not stable at the
lower temperature, and a total of four crystals were used for data
collection at room temperature (Table 2). For both derivatives,
measurement of anomalous differences was optimized by aligning crystals
so that the 2-fold rotation axis (b*) was precisely along the oscillation
axis, leading to simultaneous measurement of Bijvoet pairs.

Difference Patterson maps for the mercuric chloride derivative
showed a small number of strong peaks (10-16 standard deviations
above the mean density value) in the y = 1/2 Harker section, and heavy



Table 1. Refined Heavy Atom Positions

Derivative  Site  Relative    x     y     z     Derivatized  Distance to
                  Occupancy                     Cysteine     Sulphur (Å)

HgCl2       1     1.0         0.26  0.40  0.44  79           1.88
            2     1.0         0.41  0.80  0.99  79'          2.14
            3     0.6         0.00  0.86  0.09  180          1.60
            4     0.6         0.84  0.50  0.52  180'         1.69
            5     0.6         0.83  0.39  0.88  333          1.95
            6     0.6         0.00  0.54  0.53  333'         1.51

EMP         1     0.6         0.26  0.40  0.43  79           1.37
            2     0.4         0.41  0.80  0.98  79'          1.68
            3     0.8         0.01  0.86  0.09  180          1.52
            4     0.9         0.84  0.50  0.52  180'         1.81
            5     0.3         0.83  0.39  0.88  333          1.90
            6     0.3         0.00  0.54  0.53  333'         1.59

Sites not present in HgCl2 derivative:
            7     0.9         0.81  0.38  0.93  260          1.60
            8     1.0         0.02  0.50  0.49  260'         2.16

The positions are fractional coordinates in the unit cell, and the prime on the cysteine number indicates that it is in the second monomer.
Table 2. Statistics for Data and Derivatives

                                      Native    HgCl2 I   HgCl2 II  EMP

Number of crystals                    1         [...]     [...]     [...]
Concentration (mM)                    -         [...]     [...]     [...]
Soaking time (hr)                     -         12        12        [...]
Resolution (Å)                        2.4       2.78      3.0       2.79
Measured reflections                  101,590   86,408    42,097    61,458
Unique reflections                    30,957    19,660    16,148    19,568
Completion (%)                        94.4      93.3      97.3      94.1
Rmerge (a) (%)                        7.29      7.87      7.41      6.78
Mean isomorphous difference (b) (%)             24.4      25.6      17.8
Phasing power (c)                               2.1       2.1       1.4
Mean figure of merit                  0.67

The two HgCl2 data sets are for differently oriented crystals (see text).
(a) $R_{merge} = \sum |I_h - I_{hi}| / \sum I_{hi}$, where $I_{hi}$ is the scaled intensity of the ith observation
of reflection h, and $I_h$ is the mean value. (b) $\sum |F_{PH} - F_P| / \sum F_P$, where $F_{PH}$
and $F_P$ are the scaled derivative and native structure factor amplitudes,
respectively. (c) Phasing power: rms heavy-atom structure factor / rms
lack of closure.

atom positions were readily determined by manual inspection.
Additional sites were found by difference Fourier techniques. Heavy atom
parameters were refined and initial phases were calculated using the
program HEAVY (Terwilliger and Eisenberg, 1983), including
anomalous differences for only those reflections that were fully recorded on
one image (71.7% and 70.2% of the data had anomalous
measurements included for mercuric chloride and ethylmercury phosphate,
respectively). The quality of the anomalous data was judged to be
good, since the anomalous difference Patterson map recapitulated
the major peaks in the difference Patterson map. For calculating and
refining the MIR phases, the mercuric chloride data were divided into
two sets. The first set consists of data (including anomalous
differences) from three crystals, each of which had been oriented with the
b*-axis parallel to the oscillation axis. The second set consists of data
from a crystal mounted in an arbitrary orientation and does not include
anomalous differences, and better results were obtained by treating
these data separately in the phase calculations. Summary statistics
from the heavy atom phasing procedure are given in Table 1.
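
Two of the data-quality statistics reported for the native and derivative data sets, the merging R value and the mean isomorphous difference, are simple ratios of sums. The Python sketch below shows one way to compute them from their footnote definitions; the data layout (a dictionary mapping each reflection to its repeated intensity measurements, and parallel lists of amplitudes) and the toy numbers are assumptions made for illustration only.

```python
def r_merge(observations):
    """Sum of |I_h - I_hi| over all observations, divided by the sum of I_hi.

    observations maps a reflection index to its list of scaled intensities.
    """
    numerator = 0.0
    denominator = 0.0
    for intensities in observations.values():
        mean = sum(intensities) / len(intensities)
        numerator += sum(abs(mean - i) for i in intensities)
        denominator += sum(intensities)
    return numerator / denominator


def mean_isomorphous_difference(f_derivative, f_native):
    """Sum of |F_PH - F_P| divided by the sum of F_P, over paired amplitudes."""
    if len(f_derivative) != len(f_native):
        raise ValueError("amplitude lists must be paired")
    numerator = sum(abs(fph - fp) for fph, fp in zip(f_derivative, f_native))
    return numerator / sum(f_native)


# Toy data, made up for illustration.
toy_observations = {(1, 0, 0): [100.0, 110.0], (0, 1, 0): [50.0, 50.0]}
print(f"R_merge = {100 * r_merge(toy_observations):.2f}%")
print(f"mean isomorphous difference = "
      f"{100 * mean_isomorphous_difference([12.0, 8.0], [10.0, 10.0]):.1f}%")
```

Both functions return fractions; multiplying by 100 gives the percentages used in the statistics table.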

The MIR phases were further improved by solvent flattening (Wang,
1985) using a program written by W. Kabsch (COMBINE). These
phases were then used to generate a 3 Å electron density map.
Predominant features of the structure, such as the donut shape, the
two-layer architecture with helices in the middle and β sheets outside, and
the approximate 6-fold symmetry, were immediately recognized. The
program BONES (Jones and Thirup, 1986) was used to generate a
skeleton representation of the density, which made evident the fact
that the structure is made up of six copies of very similar modules.
An additional feature that emerged at this stage was that the 2-fold
noncrystallographic symmetry apparent in the mercury positions
correlated extremely well with a 2-fold noncrystallographic symmetry in the
electron density. This 2-fold axis is only 12° away from the
crystallographic 2-fold screw axis, and was therefore not identified in
self-rotation functions.

A partial model of poly-Ala chain was built for 80% of the structure
fairly readily, using the program FRODO and a data base of protein
structures (Jones and Thirup, 1986). During this process, partial atomic
models were repeatedly used to improve the phases by phase
combination with the MIR and solvent-flattened phases, using COMBINE.
Although the noncrystallographic symmetry was never explicitly used
in the phase refinement process, it provided a useful check on the
accuracy of the phases, as did the unexpected but obvious 3-fold
symmetry within each monomer. The topology of the protein fold and





Figure 10. Stereo View of the Electron Density

[...] calculated phase. The model used is the final refined one.
the locations of the N- and C-termini of the monomers were clearly
recognized at this stage. A complete model of one monomer with the
correct sequence (Ohmori et al., 1984) was readily built (except for a
loop of 10 residues that was left as poly-Ala) by assuming that the
mercury atoms had bound to cysteine residues, and tracing the
sequence out from these points. The second monomer was generated
from this model by the noncrystallographic symmetry operation.

This atomic model was refined against 3.0 Å native X-ray data by
least-squares refinement with both monomers treated independently,
using the program X-PLOR (Brünger, 1988). Two hundred steps of
Powell optimization and 40 steps of individual B factor refinement
smoothly reduced the R factor from 46.2% to 24.9% with no manual
intervention, indicating the accuracy of the initial model. Repeated
model building using FRODO and least-squares refinement using
X-PLOR were carried out. A total of five rounds of simulated annealing
refinement were then carried out, using initial and final temperatures
of 1000 K and 300 K, respectively (Brünger et al., 1987; Weis et al.,
1990). The resolution of the native data included was extended to 2.5
Å, and 150 well-resolved solvent molecules (interpreted as water) were
included. The current model has the entire sequence built in, and has
unbroken backbone electron density from the N- to the C-terminus in
difference Fourier maps. The R factor is 18.9% (for 27,614 reflections
with |F| > 2σ(F)) and the rms deviation from ideal geometry is 0.017 Å
for bond lengths and 3.5° for bond angles, with good stereochemistry
for the backbone torsion angles. Representative electron density for
the final refined model is shown in Figure 10, at the dimer interface.
Although the electron density is weak for a few surface sidechains
(arginines and lysines in particular), the backbone density is strong
throughout the structure, including all surface loops. The orientation
of the carbonyl groups is clear for almost all residues (Figure 10).
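
The R factor quoted above is restricted to reflections with |F| > 2σ(F). As a hedged illustration, the Python sketch below uses the conventional definition R = Σ| |Fobs| - |Fcalc| | / Σ|Fobs| with that kind of sigma cutoff. The text does not spell out the formula or how the calculated amplitudes were scaled, so the definition, the tuple layout, and the toy data are all assumptions.

```python
def r_factor(reflections, sigma_cutoff=2.0):
    """Conventional crystallographic R factor with a sigma cutoff.

    reflections is an iterable of (f_obs, sigma_f_obs, f_calc) amplitudes.
    Only reflections with f_obs > sigma_cutoff * sigma_f_obs are used.
    """
    kept = [(fo, fc) for fo, sig, fc in reflections if fo > sigma_cutoff * sig]
    if not kept:
        raise ValueError("no reflections pass the sigma cutoff")
    numerator = sum(abs(fo - fc) for fo, fc in kept)
    denominator = sum(fo for fo, _ in kept)
    return numerator / denominator


# Toy reflections, made up for illustration; the third fails the 2-sigma cutoff.
toy = [(100.0, 5.0, 90.0), (50.0, 5.0, 55.0), (8.0, 5.0, 20.0)]
print(f"R = {100 * r_factor(toy):.1f}%")
```

Weak reflections are excluded here before either sum is formed, which is why the cutoff changes both the numerator and the denominator.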
Summary

The crystal structure of the processivity factor required
by eukaryotic DNA polymerase δ, proliferating
cell nuclear antigen (PCNA) from S. cerevisiae, has
been determined at 2.3 Å resolution. Three PCNA
molecules, each containing two topologically identical
domains, are tightly associated to form a closed ring.
The dimensions and electrostatic properties of the ring
suggest that PCNA encircles duplex DNA, providing a
DNA-bound platform for the attachment of the
polymerase. The trimeric PCNA ring is strikingly similar to
the dimeric ring formed by the β subunit (processivity
factor) of E. coli DNA polymerase III holoenzyme, with
which it shares no significant sequence identity. This
structural correspondence further substantiates the
mechanistic connection between eukaryotic and
prokaryotic DNA replication that has been suggested on
biochemical grounds.

Introduction 

The rapid replication of chromosomes relies on DNA
polymerases that initiate replication in response to regulatory
signals, achieve high processivity without dissociation
from the template, and then disengage rapidly and restart
replication elsewhere as needed (reviewed by Kornberg
and Baker, 1991; Kuriyan and O'Donnell, 1993; Stillman,
1994). Mechanistic studies on the chromosomal
replicases of Escherichia coli, bacteriophage T4, and
eukaryotes have shown that these multisubunit polymerase
complexes include a distinct processivity factor, variously
referred to as a sliding clamp or a DNA-tracking protein,
that is required for nondissociative DNA replication. The
clamp is first loaded onto primed DNA by auxiliary proteins
in an ATP-dependent process, and subsequently the
polymerase machinery attaches to the clamp and starts
processive DNA replication. The utilization of a processivity
factor that is distinct from the catalytic subunits allows for
both rapidity and control. During chromosomal replication,
additional DNA clamps are loaded onto each newly primed
template on the lagging strand, and the catalytic
machinery can switch from one clamp to the next after an Okazaki
fragment is completed (Stukenberg et al., 1994). On the
other hand, sequestration of the clamp or inactivation of
the proteins needed to load the clamp onto DNA might
provide a way for the cell to control the progression of
DNA replication.

Mammalian proliferating cell nuclear antigen (PCNA),
so named because of its initial discovery as a cell
cycle-dependent antigen (Miyachi et al., 1978), is the
processivity factor or sliding clamp for DNA polymerase δ (Pol δ)
and is essential for the replication of SV40 and BPV viral
DNA by this enzyme (Prelich et al., 1987a, 1987b; Wold
et al., 1988; Tan et al., 1986; Bravo et al., 1987; Müller
et al., 1994). In yeast, the POL30 gene encoding PCNA
is essential for cell growth, showing a requirement for
PCNA in chromosomal DNA replication (Bauer and
Burgers, 1988). PCNA is also a processivity factor for Pol ε,
another essential eukaryotic DNA polymerase (Morrison
et al., 1990; Burgers, 1991; Lee et al., 1991b; Podust and
Hübscher, 1993). In addition, PCNA and Pol δ also play
essential roles in the repair of damaged DNA (Shivji et al.,
1992; Zeng et al., 1994). In nontransformed cells, PCNA
is found in complexes with a variety of cyclins,
cyclin-dependent kinases, and the p21 protein inhibitor of
cyclin-dependent kinases, also known as Waf1 or Cip1 (Xiong
et al., 1993; Zhang et al., 1993). This suggests a potential
role for PCNA in the response to cellular regulatory
signals, particularly since the p21 protein can bind directly
to PCNA and thus inhibit DNA replication (Waga et al.,
1994; Flores-Rozas et al., 1994).

Reconstitution of total replication of DNA containing viral
origin sequences reveals that PCNA is required for the
complete replication of both the leading and the lagging
strands by Pol δ (Waga and Stillman, 1994; Müller et al.,
1994). DNA polymerase α and primase lay down short
stretches of RNA and DNA on the template to start DNA
replication. Recognition of the resulting template-primer
junction by the multipolypeptide replication factor C (RFC),
also known as activator 1, results in the loading of PCNA
onto the DNA and the inhibition of further synthesis by
Pol α (Tsurimoto and Stillman, 1990; Lee et al., 1991a).
Pol δ then binds to PCNA and carries out processive DNA
replication, either continuously for the leading strand or
discontinuously for the lagging strand (Okazaki fragment
synthesis). Elimination of any of RFC, PCNA, or Pol δ
prevents leading strand synthesis and results in the
generation of incomplete Okazaki fragments (Waga and
Stillman, 1994). In these viral replication systems, there is no
apparent in vitro requirement for Pol ε. However, during
chromosomal DNA replication, which is undoubtedly more
complex, Pol ε may also play a role in lagging strand
synthesis in a PCNA-dependent manner (Burgers, 1991;
Podust and Hübscher, 1993).

The broad outline of the mechanism of processive DNA
replication in eukaryotes that is now emerging has
elements in common with the mechanisms of the well-studied
Pol III holoenzyme of E. coli and the T4 DNA polymerase
(Stillman, 1994). The functional equivalents of PCNA in
these prokaryotic replicases are the β subunit of Pol III
and the gene 45 protein of T4 phage (Cha and Alberts,
1988; Kornberg and Baker, 1991; Kuriyan and O'Donnell,
1993) (here referred to as β subunit and gene 45 protein,
respectively). Although the processivity factors function
by forming a strong association with DNA, the interaction
is topological rather than chemical, and these molecules
cannot normally bind to DNA without assistance from
additional factors (Stukenberg et al., 1991; Tinker et al., 1994;
Kong et al., 1992). Thus, another required component of
these systems is the protein complex (γ complex in E. coli,
the products of genes 44/62 in T4 phage, and the RFC
complex in eukaryotes) that recognizes and binds to the
duplex-single-strand junctions that are initiation points for
replication and then functions to load the DNA clamps onto
DNA in an ATP-dependent manner (Kornberg and Baker,
1991). Some sequence similarity has been noted between
the subunits of the γ complex, T4 phage genes 44/62, and
the RFC complex (O'Donnell et al., 1993; Li and Burgers,
1994).

The picture of a sliding clamp that tracks DNA while
holding the polymerase onto the template is consistent
with the results of elegant biochemical analyses of the
interactions with DNA of β subunit, gene 45 protein, and
recently, PCNA (Nossal and Alberts, 1984; Stukenberg et
al., 1991, 1994; Jarvis et al., 1991; Burgers and Yoder,
1993; Tinker et al., 1994). Using radioisotope labeling or
cross-linking, it has been shown that the clamps do not
load onto closed DNA without assistance from the ATPase
subunits of the polymerase. Once loaded onto duplex
DNA, they can track along it unless prevented by
obstructions, such as bound transcription factors or hairpin
structures in single-stranded DNA. Stable complexes of these
proteins and circular DNA can be dissociated by
linearizing the DNA and allowing the protein to fall off the ends.

An insightful analysis of such experiments carried out
for the E. coli Pol III system led to the suggestion that the
β subunit might form a toroidal structure that could encircle
DNA (Stukenberg et al., 1991). This was confirmed by
the determination of the crystal structure of the β subunit,
which revealed that a β dimer forms a closed circular ring
with a hole in the middle that is large enough for the
passage of duplex DNA with no steric hindrance (Kong et
al., 1992). Several other features of the structure, most
strikingly the electrostatic potential generated by the
molecule, make it likely that the protein functions as a ring
through which DNA is threaded, thus serving as a
DNA-bound platform for the polymerase machinery. Evidence
for this model is provided by electron microscopic
visualization of the DNA complexes formed by the T4 gene 45
protein (the clamp) and the gene 44/62 proteins (the clamp
loaders), which reveals strands of DNA encircled by
disk-shaped structures with dimensions matching those of the
β subunit dimer (Gogol et al., 1992).

There is no significant sequence similarity between
PCNA (258 residues), gene 45 protein (227 residues), and
the β subunit (366 residues) and, therefore, no reliable
conclusions could be drawn regarding possible structural
relationships between them. However, an unexpected
feature of the three-dimensional structure of the β subunit
immediately suggested such a connection. Despite the
lack of internal symmetry in the sequence, each molecule
of β subunit consists of three domains of identical topology
that interact to form an approximately 6-fold symmetric
dimer (Kong et al., 1992). This structural modularity and
the fact that PCNA and T4 gene 45 protein are
approximately two-thirds the size of β subunit suggested that the
former proteins might form trimeric rings, with each
monomer of the protein contributing two domains that are similar
to those found in the β subunit. The sequences of PCNA
and gene 45 protein were aligned with the sequences of
the domains in the β subunit so as to preserve the
hydrophobic nature of the internal core and to maintain net
positive charge surrounding the hole in the ring (Kong et
al., 1992). However, the extremely low level of sequence
identity (~10%) between the structures made the
sequence alignment and the corresponding structure
prediction rather unreliable.

We have now determined the crystal structure of PCNA
from the yeast Saccharomyces cerevisiae by
multiwavelength anomalous diffraction (MAD), and we show that it
does indeed form a trimeric ring with close topological
symmetry to the structure of the β subunit dimer, as
predicted previously. The accuracy of the earlier prediction
appears, however, to have been more a triumph of intuition
than of reason since we find that the sequence alignment




Figure 1. Electron Density Map Calculated Using Phases Derived
from MAD Measurements

The red lines represent electron density contours at the 2^ σ level
in an electron density map calculated using amplitudes and phases
derived solely from the MAD experiment and modified using SQUASH
(Zhang and Main, 1990). This map, calculated at 3 Å resolution,
incorporates no prior information about the atomic structure and shows
clearly that a PCNA trimer forms a closed ring with a hexagonal outline.
For reference, the Cα backbone of the final refined model is shown
in yellow. The crystallographic 3-fold axis passes through the center
of the ring. Extra electron density features at the upper left and upper
right and to the lower right are due to three symmetry-related rings
that are packed at an angle to the central one. The mercury atoms
(two per monomer) are shown as blue spheres. This figure was
generated by program XT (Jones et al., 1991).
Figure 2. Schematic Diagram and Stereodiagram of a Monomer of PCNA
Schematic diagram of secondary structural elements ... shown as
arrows. Elements within the ... first domain are sequentially ...
of secondary structure are identified uniquely by the labels of the
strands or helices. For example, the A1B1 loop is the one between αA1
and βB1. ... lengths of the connecting ... number of residues ...

... three-dimensional structure of one monomer. The N-terminal domain
is shown in ... and the C-terminal domain ... The interdomain
connecting loop is shown in red. This stereodiagram was generated
using MOLSCRIPT (Kraulis, 1991).



on which the prediction was based (Kong et al., 1992) is
largely incorrect. The crystallographic analysis presented
here now provides firm evidence that the close
correspondence between form and function that was observed in
the prokaryotic DNA polymerase processivity factor is
preserved in the corresponding eukaryotic protein, and it
provides an accurate structural model for the detailed
investigation by mutagenesis of the various interactions mediated
by PCNA.

Results and Discussion 
Structure Determination

The crystal structure of PCNA from S. cerevisiae was
determined using phases derived from MAD (Hendrickson,
1991), measured for a mercury complex of the protein.
The experimentally derived phases resulted in an electron
density map of high quality (Figure 1), on the basis of
which a model for the PCNA structure was built and refined
against data to 2.3 Å resolution to a final R value of 18.8%.
Mercuration of PCNA results in crystals that diffract
significantly better than those for the unmercurated protein
(Krishna et al., 1994), and the structure of the mercury
complex will be used for all the analysis that follows. The
structure of unmercurated PCNA has been refined at 3 Å
resolution, and it confirms that no significant
conformational changes are induced as a result of mercuration (see
Experimental Procedures).

Structure of the Trimeric Ring and the
Component Domains

The crystal structure of PCNA reveals a closed circular
ring, which results from tight association between three
molecules that are related by a crystallographic 3-fold axis
(Figure 1). Internal symmetry in each monomer of PCNA
leads to approximate hexagonal symmetry in the trimer
(Figures 1, 2, and 3). This internal symmetry is not
reflected in the sequence of PCNA, but is obvious in the
secondary structural elements and in the fold of the
polypeptide chain (see below). The ring formed by PCNA is
strikingly similar to that formed by the β subunit in terms
of shape, size, and internal architecture. In both
structures, an outer layer of 6 β sheets forms a circular collar
that supports 12 α helices lining the inner surface of the
ring (Figure 3A). The notable difference between PCNA
and β subunit is that the former is a trimer (258 residues,
and 28.916 kDa per monomer), whereas the latter is a
dimer (366 residues, and 40.583 kDa per monomer).
Hydrodynamic (Bauer and Burgers, 1988) and dynamic light
scattering experiments (data not shown) indicate that, in
Figure 3. Computer-Generated Images of
PCNA Trimer and β Subunit Dimer

(A) Ribbon representations of the polypeptide
backbones of PCNA and β subunit, with
hypothetical duplex DNA. In this schematic
representation of a trimer of PCNA (left) and a dimer
of β subunit (right), strands of β sheet are
shown as flat ribbons, and α helices are shown
as spirals. Individual monomers within each
ring are distinguished by different colors. A
hypothetical model of duplex B-form DNA is
placed in the geometrical center of each
structure to illustrate the hypothesis that the rings
encircle duplex DNA. This figure was
generated using QUANTA (Molecular Simulations).

(B) Molecular surface of the PCNA trimer. The
solvent-accessible surface of the trimer is
displayed, with the surface colored differently for
each of the monomers. The phosphate
backbone of a strand of standard B-form DNA is
indicated by the red spiral. Only 1 of the 2
strands of duplex DNA is shown for clarity. This
figure was generated using GRASP (Nicholls
et al., 1991).



solution, the molecular mass of PCNA is 80 kDa-85 kDa,
corresponding to a trimer and suggesting that the crystal
structure reflects the state of the molecule in solution.

The close structural correspondence between the rings
formed by trimers of PCNA and dimers of β subunit arises
because there are six topologically identical domains in
each of the rings. Each PCNA monomer contributes two
domains to the trimeric ring, as predicted previously,
whereas each β subunit monomer contributes three to the
dimeric ring (Kong et al., 1992). The domains consist of
two antiparallel β sheets that approach each other closely
at one edge, as in a β sandwich, but that are splayed
apart at the other by two α helices that pack against the
core of hydrophobic residues between the sheets (see
right panel, Figure 2). The resulting angle between the
sheets is about 45°, and the structure resembles a wedge
with the two α helices fronting the blunt edge of the wedge.

The two domains in PCNA are joined firmly together by
the formation of an extended β sheet that forms a
contiguous surface across the interdomain boundary. The other
β sheets in the domain are extended across the
intermolecular boundaries. The curvature of the sheets and the
angle between them result in circular symmetry: a
rotation of 60° about an axis passing through the center of
the ring rotates the N-terminal domain of a monomer into
close correspondence with the C-terminal domain (rms
deviation of 1.1 Å for 55 Cα atoms within secondary
structural elements). A rotation of 120° maps the N-terminal
domain of one monomer to the corresponding domain in
the next monomer in the trimeric ring, a consequence of
the crystallographic symmetry. This results in a 9-stranded
antiparallel β sheet being formed across the
intermolecular interface, with interactions between the C-terminal
domain of one monomer and the N-terminal domain of the
next one being topologically similar to those between the
N- and C-terminal domains of the same monomer.
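
The symmetry operations described here (a rotation about the ring axis followed by an rms comparison of paired Cα positions) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the ring axis is taken to be the z axis, the coordinates are invented toy values rather than atoms from the PCNA model, and the function names `rotate_about_z` and `rmsd` are hypothetical helpers, not part of any crystallographic package.

```python
import math

def rotate_about_z(points, degrees):
    """Rotate (x, y, z) points about the z axis (assumed here to be the ring axis)."""
    theta = math.radians(degrees)
    c, s = math.cos(theta), math.sin(theta)
    return [(c * x - s * y, s * x + c * y, z) for x, y, z in points]

def rmsd(a, b):
    """Root-mean-square deviation between two paired lists of 3D points."""
    if len(a) != len(b) or not a:
        raise ValueError("point lists must be non-empty and of equal length")
    total = sum((p - q) ** 2 for u, v in zip(a, b) for p, q in zip(u, v))
    return math.sqrt(total / len(a))

# Toy coordinates (hypothetical): three "Calpha atoms" of an N-terminal domain.
domain1 = [(20.0, 0.0, 0.0), (25.0, 3.0, 1.0), (30.0, -2.0, -1.0)]
# Idealized C-terminal domain: an exact 60-degree copy of domain 1.
domain2 = rotate_about_z(domain1, 60.0)

# A 60-degree rotation superimposes domain 1 on domain 2 in this idealized case.
print(rmsd(rotate_about_z(domain1, 60.0), domain2) < 1e-9)
# Without the rotation the two domains do not coincide.
print(rmsd(domain1, domain2) > 1.0)
```

In a real structure the two domains are only approximately related, so the first comparison would give a small nonzero value (the text quotes 1.1 Å for 55 Cα atoms) rather than zero.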

The modular construction of PCNA that is apparent in
the repetitive arrangement of the domains also manifests
itself at the level of the secondary structural elements,
which form a 2-fold symmetric pattern within each domain.
As in the β subunit, the entire ring results from the 12-fold
repeat of a β-α-β-β-β motif around a circle (for example,
βA-αA-βB-βC-βD or βF-αB-βG-βH-βI in either domain).
A short strand (βE in both domains of PCNA) is absent in
the β subunit and can be considered an insertion in the
loop connecting βD to βF. The structural correspondence
between the two halves of a domain is only approximate:
rotation by 180° about an axis perpendicular to the 3-fold
axis of the ring and passing between the two helices
results in a rough alignment of the two halves, with positional
deviations of as much as 3 Å-4 Å for Cα atoms in
corresponding residues.

The connecting loop between the N- and C-terminal
domains is particularly interesting as it might be important
for the maintenance of the relative orientation between
the two domains in a monomer. The last strand of the
N-terminal domain (βI1) is the farthest from the first strand
(βA2) of the C-terminal domain, necessitating a long
connecting loop that runs across strands that make up the
interdomain β sheet (this loop is marked in red in right
panel, Figure 2). Seven hydrophobic residues in this loop
extend toward the surface of the underlying β sheet and
interact with hydrophobic side chains from both domains.
In addition, polar and charged groups on the loop interact
either directly or via water molecules with residues in the
β sheet (resulting in five direct hydrogen bonds, seven
water-mediated hydrogen bonds, and one ion pair). The
nature of this cross-back connection between domains
is conserved in the β subunit (which has one additional
connecting loop owing to the presence of three domains
within each monomer) and evokes comparison with the
bands that serve to keep the staves of a barrel in place.

Intermolecular Interactions in the PCNA Trimer
The central element of the intermolecular interface is an
antiparallel interaction between strands βD2 in one
molecule and βI1 in the next, which results in a total of eight
main chain amide-to-carbonyl hydrogen bonds across the
intermolecular interface. This interaction is analogous to
that formed by strands βD1 and βI2 at the interdomain
interface within a monomer, with the same number of bridging
hydrogen bonds. The importance of these interactions at
the trimer interface is illustrated by a yeast PCNA mutant
(Ser-115→Pro in βI1) that is cold sensitive for growth. The
introduction of a proline in the βI1 strand is likely to disrupt
the hydrogen bonding interactions at the intermolecular
interface and, indeed, the monomer-trimer equilibrium of
the mutant PCNA is strongly shifted toward the monomer
form (Burgers et al., unpublished data).

Two helices, αA2 in one molecule and αB1 in the next,
pack against each other at the interface. The hydrophobic
core of each domain is formed by the packing of these
(and other) helices against the β sheets, and the formation
of the intermolecular interface leads to the burial of one
edge of the hydrophobic core in each domain. Specifically,
the side chains of Ile-78, Ala-112, Tyr-114, and Leu-116
in one molecule and Leu-154, Val-180, Ile-181, and Ile-182
in the other are solvent exposed in separated monomers,



Figure 4. Electrostatic Potential Maps for a
PCNA Trimer

These maps were generated and displayed
using the program GRASP. The side chains of
lysine, arginine, aspartate, and glutamate
residues were assigned single positive or
negative charges as appropriate, and all other
residues were considered neutral. The
electrostatic potential was calculated using a uniform
dielectric of 80 and an ionic strength
corresponding to 0.1 M NaCl for the solvent and a
dielectric of 2.0 for the protein interior.
Orthogonal two-dimensional slices of the electrostatic
potential map are shown, passing through the
center of the ring. The brightest red and blue
colorations correspond to electrostatic
potentials of -2.0 and +2.0 kT/e, respectively,
where k is the Boltzmann constant, T is the
temperature, and e is the charge of the
electron. The polypeptide backbone of PCNA is
indicated by an open tube.

but are buried in the trimer. In addition, the side chains
of Ile-175 and Phe-185 undergo partial burial upon
formation of the interface. The total surface area buried at each
interface of the PCNA trimer is 1500 Å², which is
comparable to the 1400 Å² of surface buried at each interface in
the β subunit dimer (calculated using a 1.4 Å radius probe).

One characteristic feature of the β subunit dimer is the
presence of six potential ion pairs at each interface, two
of which are buried, with all the positively charged groups
contributed by one molecule and all the negative groups
by the other. This polarization of the ends of the monomer
would ensure formation of a head-to-tail dimer, with
resulting asymmetry in the faces of the ring (Kong et al.,
1992). Head-to-tail interactions also occur in the PCNA
trimer, although ion pairing is not a prominent feature of
the PCNA interface. Only one ion pair is observed at the
interface, between Asp-150 in αA2 and Arg-110 in βI1. The
intermolecular hydrogen bonding network is less
extensive in the β subunit, which has only four hydrogen bonds
between strands βD2 and βI1, compared with eight in
PCNA. Despite these differences, the intermolecular
interactions in both PCNA and β subunit are substantial,
suggesting that energy would have to be expended to break
the ring.

Potential Interactions with DNA
The structure of PCNA, like that of the β subunit, has been
determined in the absence of DNA. Indeed, the ability of
PCNA to slide along duplex DNA (Tinker et al., 1994; M.
O'Donnell, personal communication) makes it unlikely that
a protein-oligonucleotide complex that is sufficiently
stable for crystallography could be generated. Despite the
inability to visualize a PCNA-DNA complex
crystallographically, the toroidal shape of PCNA suggests
immediately that it can fulfill its DNA-tracking function by
encircling the duplex (Figure 3). Further examination of the
electrostatic potential generated by the molecule and the
size of the hole in the middle of the ring provides more
evidence to support this hypothesis.
Like β subunit and gene 45 protein, PCNA is highly



Cell 
1238 






acidic. A PCNA trimer contains 81 aspartate, 57
glutamate, 54 lysine, 24 arginine, and 6 histidine residues,
which results in a net charge of -60 if the histidines are
assumed to be neutral. This is even more negative than
the β subunit ring, which carries a net charge of -22.
However, like the β subunit, the distribution of charge on the
PCNA ring results in a region of positive potential
surrounding the inner surface of the ring and the hole in the
middle. The qualitative effect of the distribution of charged
residues can be seen by calculating a map of the
electrostatic potential surrounding the molecule (Sharp and
Honig, 1990; Nicholls et al., 1991). Visualization of the
map (Figure 4) shows that PCNA generates a large region
of negative electrostatic potential around it, consistent
with the high negative charge on the ring. The exception
is the central channel enclosed by the 12 α helices. This is
a region of positive potential, indicating that the phosphate
backbone of DNA can pass through the ring without
electrostatic repulsions. The positive potential is a
consequence of nine lysine and arginine residues within each
monomer, in or adjacent to the helices that encircle the
inner surface. Alignment of a variety of PCNA sequences
shows that lysine or arginine is conserved at each of these
nine positions, indicating a functional importance for the
charge distribution.
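
The net charge quoted for the trimer follows directly from the residue counts. The short Python sketch below reproduces that bookkeeping under the same simplification used in the text (unit charges on Asp, Glu, Lys, and Arg, with histidine treated as neutral); the dictionary and function names are illustrative only.

```python
# Residue counts for one PCNA trimer, as quoted in the text.
counts = {"Asp": 81, "Glu": 57, "Lys": 54, "Arg": 24, "His": 6}

# Formal charge per side chain; histidine is assumed neutral, as in the text.
formal_charge = {"Asp": -1, "Glu": -1, "Lys": +1, "Arg": +1, "His": 0}

def net_charge(residue_counts, charges):
    """Sum of (count x formal charge) over all residue types."""
    return sum(n * charges[res] for res, n in residue_counts.items())

trimer_charge = net_charge(counts, formal_charge)
print(trimer_charge)       # -60
print(trimer_charge // 3)  # -20 per monomer
```

This counts only side chains and ignores chain termini and pKa shifts, so it is a rough bookkeeping figure rather than a computed electrostatic property.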

As in the β subunit, each of the central helices is oriented
orthogonal to the local direction of the phosphate
backbone of standard B-form double helical DNA placed in
the center of the ring. This arrangement of helices would
prevent PCNA from interacting closely with the grooves
of DNA and may restrict the DNA-protein interactions to
nonspecific contacts with the phosphate backbone. In
order to obtain a sense of the dimensions of the rings formed
by PCNA and β subunit, we have computed the rotationally
averaged atomic density. Consider each of the rings to
be oriented horizontally, such that the rotational
symmetry axis is vertical. Moving horizontally outward from the
center of the ring, the density of atoms starts to rise at
about 12 Å-18 Å from the center for PCNA and at about
14 Å-20 Å for the β subunit. A precise internal diameter
is difficult to define because of the variable lengths and
conformations of side chains lining the inner surface.
Taking the distance at which the atomic density rises to half
maximal as a reference value, we obtain internal
diameters of approximately 34 Å for PCNA and 38 Å for β
subunit. The cross-sectional diameter of the B-form DNA
double helix, estimated the same way, is approximately 18 Å
(~21 Å in the case of A-form DNA) and, thus, duplex DNA
can pass through the center of the ring formed by either
PCNA or β subunit without hindrance (see Figure 3).
However, the somewhat smaller hole in PCNA, with some side
chain atoms extending to within ~12 Å of the center of
the ring, means that direct contact between phosphate
oxygens and protein side chains may occur even when
the DNA is placed in the geometric center of the ring.
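
One way to picture the rotationally averaged density and the half-maximal criterion is the Python sketch below. It is a minimal illustration, not the procedure actually used for the structures: the bin width, the normalization by annulus area, and the toy ring of evenly spread "atoms" (with an inner radius of 17 Å chosen purely for illustration) are all assumptions, and `radial_density`, `half_max_radius`, and `toy_ring` are hypothetical names.

```python
import math

def radial_density(atoms, bin_width=1.0, r_max=50.0):
    """Atoms per unit area in concentric shells around the z axis.

    Each atom is projected onto the xy plane, binned by its distance from
    the axis, and the count is divided by the area of that annulus.
    """
    n_bins = int(r_max / bin_width)
    counts = [0] * n_bins
    for x, y, _z in atoms:
        i = int(math.hypot(x, y) / bin_width)
        if i < n_bins:
            counts[i] += 1
    density = []
    for i, n in enumerate(counts):
        inner, outer = i * bin_width, (i + 1) * bin_width
        density.append(n / (math.pi * (outer ** 2 - inner ** 2)))
    return density

def half_max_radius(density, bin_width=1.0):
    """Inner edge of the first shell whose density reaches half the maximum."""
    peak = max(density)
    if peak == 0:
        return None
    for i, d in enumerate(density):
        if d >= 0.5 * peak:
            return i * bin_width
    return None

def toy_ring(r_inner=17, r_outer=40):
    """Hypothetical ring: atoms spread evenly between r_inner and r_outer."""
    atoms = []
    for i in range(r_inner, r_outer):
        r = i + 0.5
        n = round(2 * math.pi * r)  # roughly one atom per square angstrom
        for k in range(n):
            phi = 2 * math.pi * k / n
            atoms.append((r * math.cos(phi), r * math.sin(phi), 0.0))
    return atoms

radius = half_max_radius(radial_density(toy_ring()))
print(2 * radius)  # internal diameter of the toy ring: 34.0
```

With real coordinates the density rises gradually rather than as a step, which is why the text reports a range (12 Å-18 Å) for the onset and uses the half-maximal point as a reference value.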

Analysis of the density of atoms also leads to estimates
of ~80 Å for the external diameter and ~30 Å for the
thickness of the rings formed by both PCNA and β subunit.
Cryoelectron microscopic images of DNA and the
accessory proteins of the T4 phage DNA polymerase reveal a
Figure 5. Alignment of PCNA and β Subunit Sequences
The N-terminal domains of S. cerevisiae PCNA, human PCNA, and
plant (rice, Oryza sativa) PCNA are labeled yPCNA-1, hPCNA-1, and
pPCNA-1, respectively. Other mammalian PCNA sequences are
almost identical to that of human PCNA, and the plant sequence was
chosen to illustrate sequence variability. The corresponding C-terminal
domains are labeled yPCNA-2, hPCNA-2, and pPCNA-2. The three
domains of β subunit are labeled BETA-1, BETA-2, and BETA-3. The
yPCNA and β subunit sequences are aligned based on the two
three-dimensional structures, and the secondary structural elements are
boxed and labeled. In domain 2 of PCNA, αB is extended into a 3₁₀
helix. This is indicated by broken lines. For yPCNA and the β subunit,
the stippled bars indicate residues that are inaccessible to solvent in
the crystal structures of the trimer and the dimer, respectively. No
consideration of sequence conservation was used in the selection of
these residues. However, for the hPCNA and pPCNA sequences, the
stippled bars indicate residues that are similar in property to those
that are buried in yPCNA. The asterisks under the PCNA sequences
indicate the locations of conserved basic residues that contribute to the
positive electrostatic potential in the center of the ring. The numbers to
the right are the last residue numbers in each row, and the distance
between the vertical lines is 10 residues in the yPCNA sequence.



number of disk shaped objects, referred to as "hash marks",
that appear to encircle DNA (Gogol et al., 1992). These
hash marks or disks are approximately 85 Å wide and
25 Å-30 Å thick and resemble PCNA or the β subunit in
cross-section (see Figure 5a of Gogol et al., 1992).
Assuming that the disks represent the gene 45 protein, these
electron microscopic images are a direct observation of
a processivity factor encircling DNA.

Structure-Based Sequence Comparison between
PCNA and β Subunit

An unambiguous alignment of the sequences of PCNA
and β subunit was generated for each of the domains on
the basis of the known crystal structures (Figure 5). The
resulting sequence identity between the two domains of
PCNA is ~15%, which is below the level required for the
reliable alignment of sequences in the absence of
structural information (Sander and Schneider, 1991).
Nevertheless, the two domains are closely superimposable, with
an rms deviation of 0.9 Å in Cα positions for 55 residues
that form the core secondary structural elements (residues
that are in corresponding β strands or α helices in each
of the domains of PCNA and β subunit). There is no
discernible pattern of sequence identity between the two
domains. Rather, the changes in the sequence are
distributed throughout the structure, with compensatory
changes in the hydrophobic core resulting in preservation
of the architecture of the domain.
Figure 6. Delineation of the Secondary Structural Elements of PCNA
The identification of residues with β strands and α helices was made
on the basis of hydrogen bonding patterns in the crystal structure of
PCNA, and the residue numbers of the boundary residues are shown
for the two PCNA monomers. Underneath each domain, the previous
prediction of the secondary structure (Kong et al., 1992) is indicated.
Note that this prediction is out of register with the correct secondary
structural boundaries in most cases.



The level of similarity observed between the domains
of PCNA is comparable to that observed between PCNA
and β subunit or to that seen within the β subunit. The
pairwise sequence identities range from 6% (comparing
domains 2 and 3 of β subunit) to 15% (the two domains
of PCNA). The rms deviations in Cα positions for the core
secondary structural elements range from 0.9 Å to 1.7 Å.
The structure-based alignment of the PCNA and β subunit
sequences (Figure 5) shows that each of the β strands
and α helices of the β subunit is preserved in PCNA. As
in the comparison between the domains of PCNA, there
is no pattern of sequence identity between PCNA and β
subunit. The hydrophobic nature of the residues in the
cores of the domains is indeed preserved, but the
corresponding sequence changes are so extensive that simple
sequence alignment in the absence of structural
information on both proteins would be very difficult. The previous
alignment of the sequences of PCNA and β subunit (Kong
et al., 1992), based only on inspection of the structure of
the latter, is consequently in serious error (Figure 6). Only
5 of the 18 β strands were predicted in the correct
positions, and 2 of the 4 α helices are incorrectly aligned. The
βE strands of PCNA were not predicted since they are
absent in the β subunit. We note that this alignment relied
only on manual inspection and intuition and that it would be
interesting to know how well computer-based predictions
might perform in this case.
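
Pairwise identity figures like those quoted here are computed from an alignment by counting matching positions. The Python sketch below shows one common convention, assumed here because the text does not state which denominator was used: identical residues divided by the number of aligned (non-gap) column pairs. The sequences are invented toy strings, not PCNA or β subunit sequences, and `percent_identity` is a hypothetical helper.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent of aligned (non-gap) positions at which the residues match."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    pairs = [(a, b) for a, b in zip(aligned_a, aligned_b)
             if a != gap and b != gap]
    if not pairs:
        return 0.0
    matches = sum(1 for a, b in pairs if a == b)
    return 100.0 * matches / len(pairs)

# Toy alignment (hypothetical sequences): 6 aligned pairs, 4 identical.
seq_a = "MKV-LAGT"
seq_b = "MRVQLSG-"
print(round(percent_identity(seq_a, seq_b), 1))  # 66.7
```

Other conventions divide by the length of the shorter sequence or by the full alignment length, which gives somewhat different percentages at low identity, so the denominator should be stated whenever such figures are compared.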

The sequence alignment between yeast PCNA and β
subunit has been extended to include the sequences of
human and a plant PCNA as representative of the diversity
of PCNA sequences. In contrast with the alignment of
PCNA with β subunit, this extension to other PCNAs is
very reliable. Yeast PCNA is 35% and 39% identical in
sequence to the human and plant versions, respectively,
which is sufficient to ensure that the secondary structural
framework will be closely preserved (Sander and
Schneider, 1991). There are hardly any insertions or deletions
between these sequences, which share clearly
recognizable patterns of sequence identity throughout their
lengths. Consequently, the structure of yeast PCNA
presented here is a reliable guide to the three-dimensional
chain fold of human PCNA, in which there is considerable
interest.

What of the structure of the T4 phage gene 45 protein?
Given the close correspondence between the structures
of PCNA and β subunit, the electron microscopic images
of the T4 accessory proteins (Gogol et al., 1992), and the
fact that gene 45 protein is a trimer (Jarvis et al., 1989),
it is very likely that gene 45 protein will adopt a similar
architecture. Nevertheless, given the difficulties
encountered in generating an accurate sequence alignment
between PCNA and β subunit, it would be best to postpone
further discussion until the three-dimensional structure of
gene 45 protein is determined experimentally.

Interaction of PCNA with Other Molecules

During phases of DNA replication or repair, PCNA
interacts with the RFC complex, with Pol δ, and perhaps with
other components of the replication or repair machineries.
At other times, PCNA is found complexed to the cell cycle
proteins p21, cyclin-dependent kinase, and various cyclins
(Zhang et al., 1993). Relatively little is known about the
specific regions of PCNA that are important for mediating
these interactions. Deletion mutagenesis has been used
to map the regions of PCNA that potentially interact with
antibodies (Brand et al., 1994) and with D-type cyclins
(Matsuoka et al., 1994), but the relatively large deletions
used make it difficult to interpret these results in terms
of localized regions in the folded structure of PCNA. The
two most prominent loops in PCNA, the interdomain
connector (residues 118-135; see right panel, Figure 2 and
Figure 3) and the D2E2 loop in the second domain
(residues 183-195), resemble protuberant "handles" on the
trimer and are likely sites for intermolecular interaction.
Not surprisingly, these loops are highly antigenic (Roos
et al., 1993; Brand et al., 1994).

Five mutations in yeast PCNA have been identified that
suppress the phenotype of mutations in the large subunit
of the RFC complex (the cdc44 gene product) (McAlear
et al., 1994). Based on the previous alignment of PCNA
and the β subunit (shown here to be incorrect), it appeared
that the five mutations clustered in one region of the
predicted structure of PCNA. We have reexamined the
locations of these mutations in the crystal structure of PCNA
and find that all occur at positions that are completely or
partially buried in the structure of the trimer. The mutations
are therefore likely to cause changes in the structure of
PCNA, particularly in the vicinity of the β sheet that extends
across the intermolecular interface. Two of the mutations
are in strands that form this sheet (Ala-112→Thr in βI1 and
Val-203→Ala in βF2). Two other mutations (Leu-151→Ser
in αA2 and Ser-135→Phe in βA2) involve side chains that
pack against this sheet. The fifth mutation (Gly-69→Asp
in βF1) is particularly interesting since the Cα atom of
Gly-69 is tightly packed between the Cα-Cβ bonds of two side
chains (Met-119 and Ile-121) that are part of the
interdomain crossover loop. This loop appears to play an
important structural role in maintaining the integrity of the
ring, and mutations at position 69 are likely to cause
significant changes in the structure of the loop in the vicinity of
the intermolecular β sheet. That these mutations rescue
phenotypes associated with defective RFC complexes
suggests that the large subunit of RFC might interact with
PCNA in the vicinity of the trimer interface or, perhaps
more likely, that the mutations might affect the monomer-
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Table 1. X^y Data Colloction StaUstics 



Mercurated PCNA 
Ref erenice Data set 

30.O-2.3 
121^ 



Statistical Paramelors 
Resolution (A) 

Unitcen(A) 

Numl)or of observed reflections 30o]303 

Number of unique reflections 26,G08 

Completeness of e» data f%) 93.7 

Overall I/o(I) 24^ 

Completeness of outer sheU (A) 02.7(2.34-2^) 

1/0(0 of outer theO (A) 6.2(2.34^2^ 

Rl l WI Wl (%) 



6.9 



Mercurated PCfM MAD Phasing 



X, ■ 1.0095 A X, . 1.0063 A m 0.9920 A 



30.0-2^ 
121.4 
120,602 
20,934 
73.6 
16.6 



30.0-2^ 
121.4 
121,809 
20,935 
72.6 
16.6 



30.0-2.5 
121.4 
126,513 
20,832 
74.3 
16.5 



Unmodified 
PCNA 

30.0-3.0 
121.7 
95,433 
12,273 
86.3 
16.1 



S^S^^ f^'JSi^ 68iJ%(2.64^ 80.9%(3.05^0) 
3^(2-64-2.60) 3.2(^54^ 3J2(^M^) 09(3.05-3.0) 
6,6 8.6 6.5 



Rve different data sets are shown: four for mercurated PCNA and ona frv tinms^^^ tvMu a «rw — T" — ^ 

•et measured for mercurated PCNA using aTSK^.^^^ «««««« for the data 

Is calculated In each case by oonslderir3^FrwJS^^ number of unique refIeciior« 

three wavelengths Indicated (X,. U W Thelirt are for data measured at the synchrotron at the 

carried out usi WtSS^' to Sc^cS^ ^^l^^^^ £^ J? ^ ""^^ ^ ^ ^ 

agreement factor, on imensity, between redundZlJ;^^ ^ ^ '^"^ ?? «nd 1 



trimer equilibrium of PCNA In a way that compensates for 
the mutations in the large subunK. The available data do 
not allow us to distinguish between these alternatives, but 
the crystal structure should be a valuable resource in de- 
signing mutations to test this and other Interactions of 
PCNA. 

Conclusion 

The crystal structures of PCNA and the β subunit are the
only views obtained so far of any of the components of
the molecular machines that replicate chromosomal DNA.
Unexpectedly, the three-dimensional architectures of
these processivity factors are strikingly similar, despite the
differences in their oligomerization states and the lack of
similarity in their sequences. These structures explain the
tight affinity of PCNA and the β subunit for DNA, which
is topological, but now bring to the forefront the question
of how the associated ATP-dependent auxiliary proteins
(γ complex in E. coli and the RFC complex in eukaryotes)
act to load these closed rings onto DNA. Sequence homology
between the components of these auxiliary proteins
(O'Donnell et al., 1993; Li and Burgers, 1994) indicates
that some commonality will likely be found in their mode
of action in eukaryotes and prokaryotes, consistent with
the structural homology in the corresponding processivity
factors. Very little is also known at present about the
structures of the catalytic subunits of the chromosomal
replicases and to what extent their attachment to the
processivity factors will resemble each other. The intriguing
possibility that PCNA might play a role in coupling the
regulation of DNA replication to cell cycle-dependent
signals points to another important class of interactions
involving PCNA that needs to be understood in structural
terms. Continuing analysis of the three-dimensional
structures of the components of these replicases is likely to
lead to new insights into the basic biological process of
DNA replication.



Experimental Procedures 
Crystallization and Data Collection

[...] previously (Bauer and Burgers, 1988; Yoder and Burgers, 1991;
Krishna et al., 1994) [...] high concentrations (~2 M) of ammonium
sulfate at pH 5.6. Crystals grown from solutions containing
mercurated PCNA (obtained by [...], followed by dialysis to remove
unreacted mercury) diffract significantly better than unmodified
PCNA, with measurable X-ray data extending to 2.3 Å and 3.0 Å for
mercurated and unmodified crystals, respectively (Krishna et al.,
1994). The best X-ray diffraction data for mercurated PCNA were
obtained using a single crystal that was flash frozen to -150°C
(space group P2₁3, a = b = c = 121.2 Å, one monomer of PCNA in the
asymmetric unit) using a Rigaku R-AXIS IIC detector [...] X-ray
source (Table 1). The anomalous difference Patterson map calculated
from these data is of high quality, with strong Harker section
peaks corresponding to [...] (peak heights are [...] σ above the
mean, using data to 3 Å resolution). In contrast, isomorphous
difference Patterson maps are noisy, with Harker section peaks that
are only [...] Patterson map, suggesting significant nonisomorphism
between crystals of the mercurated and unmercurated protein.
The diffraction from crystals of the unmodified protein is
substantially weaker than that from mercurated PCNA, and the quality
of the data is correspondingly poor (Table 1). We therefore chose
not to use data for the unmodified protein in the determination of
phases and we relied instead on anomalous diffraction from the
mercury atoms to determine the phases experimentally.

Phase Determination by MAD

[...] was carried out at the Howard Hughes Medical Institute
synchrotron resource (beamline X4A at the National Synchrotron
Light Source, Brookhaven [...]). [...] mercurated PCNA, frozen at
-160°C, was used to collect data at X-ray wavelengths corresponding
[...] (0.9920 Å) of the X-ray absorption spectrum of the crystal.
The data [...] successive oscillations. The data collection spanned
a total oscillation range of 72°, corresponding to two 36° sectors
that are related by a [...] axis. Data for each of the three
wavelengths were measured sequentially for a particular oscillation
range (and that related by a rotation of 180°) before advancing to
the next one. The resulting images were processed using the program
DENZO, and the integrated intensities were scaled and merged using
SCALEPACK (Z. Otwinowski, personal communication). The anomalous
differences were also merged and reduced to a unique set of
indices prior to all further analysis. This increased the
completeness of the anomalous difference measurements for the unique
set of indices while taking advantage of the high symmetry of the
cubic space group. The resulting wavelength-dependent statistics for
the anomalous differences are shown in Table 2.

Table 2. Crystallographic Data

                  Observed ratio of anomalous     Scattering
                  diffraction differences         factors
Wavelength (Å)    λ1        λ2        λ3          f'        f''
λ1 = 1.0095       0.070     0.037     0.042       -24.65    7.26
λ2 = 1.0063                 0.078     0.036       -16.12    11.60
λ3 = 0.9920                           0.076       -10.87    9.66

Bijvoet differences from data measured at the three wavelengths
indicated are shown in the diagonal elements of the table. Dispersive
differences are shown as off-diagonal elements. The values of the
scattering factors f' and f'' are listed for each wavelength: f' and
f'' represent the values of the real and imaginary components of the
anomalous scattering factor. The values at λ1 and λ2 were refined
using MADLSQ with the values at λ3 fixed at their calculated values
(W. A. Hendrickson, personal communication).
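The anomalous difference ratios reported in Table 2 can be illustrated with a short sketch. It assumes the conventional definition, rms(Δ|F|)/rms(|F|), which the text does not spell out; the function names and the toy amplitudes are hypothetical and are not taken from MADLSQ or any other program named here.

```python
import math
from typing import Sequence


def rms(values: Sequence[float]) -> float:
    """Root-mean-square of a non-empty sequence."""
    return math.sqrt(sum(v * v for v in values) / len(values))


def difference_ratio(f_a: Sequence[float], f_b: Sequence[float]) -> float:
    """rms(|F_a| - |F_b|) / rms(mean amplitude), assumed definition.

    Bijvoet ratio: f_a and f_b are |F(+h)| and |F(-h)| at one wavelength.
    Dispersive ratio: f_a and f_b are |F| at two different wavelengths.
    """
    if len(f_a) != len(f_b) or not f_a:
        raise ValueError("need two non-empty amplitude lists of equal length")
    diffs = [a - b for a, b in zip(f_a, f_b)]
    means = [(a + b) / 2.0 for a, b in zip(f_a, f_b)]
    return rms(diffs) / rms(means)


# Toy amplitudes (hypothetical): every pair differs by 10 around a mean of 100.
toy_ratio = difference_ratio([105.0, 95.0], [95.0, 105.0])
```

With real data the same function would be applied once per wavelength for the diagonal entries and once per wavelength pair for the off-diagonal entries.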

The algebraic formulation of Hendrickson was applied to the
reduced multiwavelength data to obtain phase probabilities. The mean
figures of merit, reported by the program MADABCD (Pahler et al.,
1990), are 0.71 (to 3 Å) and 0.64 (to 2.5 Å). The phases were improved
by solvent flattening and histogram matching using the program
SQUASH (Zhang and Main, 1990), initially at 3 Å resolution and
ultimately at 2.5 Å resolution (see below).

The map calculated using the phases obtained from SQUASH
revealed unambiguously that PCNA forms a ring-shaped, 3-fold
symmetric structure with close topological similarity to the E. coli
Pol III β subunit (see Figure 1). The map had clearly resolved side
chain features that could be interpreted readily in terms of the
sequence of yeast PCNA (Bauer and Burgers, 1990). This revealed that
although the topology of the chain fold resembled that of the β
subunit, the specific assignment of secondary structural elements in
PCNA that had been suggested previously (Kong et al., 1992) was
largely incorrect (Figure 6). The sequence of PCNA was therefore
built into the electron density map independently (there is one
monomer of PCNA in the asymmetric unit of the crystal). An atomic
model for approximately 60% of the sequence was built into the first
electron density maps using the program 'O' (Jones et al., 1991),
after which phases calculated using the atomic model were combined
with the MAD phases to 2.5 Å using SIGMAA (Read, 1986). A model
corresponding to the complete PCNA sequence was built and refined by
least-squares optimization and simulated annealing using X-PLOR
(Brünger, 1988; Brünger et al., 1990), with the stereochemical
parameters of Engh and Huber (1991). The model contains 2147
nonhydrogen atoms, including 117 water molecules. The R value of the
final model is 18.8% for 22,454 reflections between 5.0 Å and 2.3 Å
with |F| > 2σ(|F|). The rms deviation from ideality in bond lengths
and angles is 0.012 Å and 1.9°, respectively.
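For readers less familiar with the R value quoted above, the following sketch shows the conventional crystallographic definition, R = Σ||Fo| - k|Fc|| / Σ|Fo|, together with a |F| > 2σ(|F|) selection. The function names, the scale factor default and the toy amplitudes are assumptions made for illustration; this is not the X-PLOR implementation.

```python
from typing import List, Sequence


def r_value(f_obs: Sequence[float], f_calc: Sequence[float],
            scale: float = 1.0) -> float:
    """Conventional R value: sum(| |Fo| - k|Fc| |) / sum(|Fo|)."""
    if len(f_obs) != len(f_calc) or not f_obs:
        raise ValueError("need two non-empty amplitude lists of equal length")
    numerator = sum(abs(abs(fo) - scale * abs(fc))
                    for fo, fc in zip(f_obs, f_calc))
    denominator = sum(abs(fo) for fo in f_obs)
    return numerator / denominator


def select_by_sigma(f_obs: Sequence[float], sigma: Sequence[float],
                    cutoff: float = 2.0) -> List[int]:
    """Indices of reflections with |F| > cutoff * sigma(|F|)."""
    return [i for i, (f, s) in enumerate(zip(f_obs, sigma))
            if abs(f) > cutoff * s]


# Toy amplitudes (hypothetical): total misfit of 15 against a sum of 150.
toy_r = r_value([100.0, 50.0], [90.0, 55.0])
toy_kept = select_by_sigma([10.0, 1.0], [2.0, 1.0])
```

An R value of 18.8% corresponds to a return value of 0.188 from `r_value` over the selected reflections.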

A problem with the phases obtained by the MAD analysis is that
they are only 65% complete to 2.5 Å resolution (70% to 3.0 Å), leading
to breaks in electron density features. The data set measured in the
laboratory is 93% complete to 2.5 Å resolution, with a mean fractional
difference of 12.8% to 2.5 Å on |F| with respect to the data measured
at the synchrotron. This led us to use the relatively complete set of
laboratory structure factors as input to the SQUASH program, with
phase probabilities taken from the MAD experiment where available.
Unphased reflections were also included in the SQUASH calculation,
but with initial phase probability coefficients set to zero. Solvent
flattening and histogram matching were used to improve and extend the
phases, starting with data to 3.5 Å resolution and proceeding to 2.5
Å in 20 cycles. The resulting phase set is 93% complete to 2.5 Å, and
the resulting electron density map is of exceptionally high quality and
could be interpreted unambiguously in most regions of the structure,
with the exception of certain surface loops.

The improvements in the phase accuracy can be illustrated by
comparing the average value of the phase discrepancy between the final
phase set calculated from the refined model and the experimental
phases at various stages. The average discrepancy is 65° (for data
corresponding to Bragg spacings between [...] Å and [...] Å) for the
phases resulting from the MAD analysis. Application of SQUASH to
these data reduces the discrepancy to 49°. A further improvement is
afforded by switching to the complete laboratory data set: the phase
discrepancy after SQUASH drops to 41° for those reflections that were
phased by the MAD analysis and is 61° for reflections for which no
experimental phases were available.
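The phase discrepancies quoted above are averages of angular differences, which have to be taken on a circle so that 350° and 10° differ by 20° rather than 340°. A minimal sketch, assuming an unweighted mean (the paper does not state whether any weighting was used) and hypothetical function names:

```python
from typing import Sequence


def phase_difference(a_deg: float, b_deg: float) -> float:
    """Smallest absolute difference between two phases, in degrees (0 to 180)."""
    d = (a_deg - b_deg) % 360.0
    return min(d, 360.0 - d)


def mean_phase_discrepancy(phases_a: Sequence[float],
                           phases_b: Sequence[float]) -> float:
    """Unweighted mean of the circular phase differences, in degrees."""
    if len(phases_a) != len(phases_b) or not phases_a:
        raise ValueError("need two non-empty phase lists of equal length")
    total = sum(phase_difference(a, b) for a, b in zip(phases_a, phases_b))
    return total / len(phases_a)


# Toy phases (hypothetical), in degrees.
toy_wrap = phase_difference(350.0, 10.0)
toy_mean = mean_phase_discrepancy([0.0, 90.0], [45.0, 270.0])
```

For two sets of uniformly random phases this mean tends to 90°, which gives a sense of scale for the reported drop from 65° to 41°.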

Comparison of the Structures of Mercurated
and Unmercurated PCNA

The two mercury atoms that are bound to PCNA are coordinated by
3 of the 4 cysteine residues in the protein. The first mercury
cross-links the side chains of Cys-30 and Cys-[...] by forming
covalent bonds with the sulfur atoms of both residues. The second
mercury is bound to Cys-[...]. The occupancies of the mercury sites
were estimated by computing the R value as a function of occupancy
and were set to 1.0 and 0.6 for the two sites, respectively. The
sulfur-mercury distances were restrained to 2.55 Å during refinement,
the value obtained by averaging sulfur-mercury bond lengths observed
in several small molecule structures (Wells, 1984). The first mercury
cross-links cysteine residues that are on opposite sides of the β
sandwich of domain 1. This internally cross-linked region is close to
the 3-fold rotation axis of the crystal and is packed close to two
symmetry-related copies of the same region in different PCNA trimers.
Stabilization of the structure in this region may contribute toward
improving the crystalline order. The second mercury atom is bound
near the inner surface of the ring and is quite distant from crystal
contacts.

The final refined model for mercurated PCNA was used to initiate
least-squares refinement for a model for the unmercurated protein at
3.0 Å resolution, which was followed by simulated annealing and
manual adjustment of the model. The present R value is 22.4%, with 10
water molecules included. The rms deviation from ideality in bond
lengths and angles is 0.013 Å and 2.1°, respectively. Comparison of
the structures of mercurated and unmercurated PCNA shows that
no gross changes in the structure occur upon mercuration. The rms
deviation in Cα positions, after superimposing the monomers, is 0.66
Å overall. The higher resolution of the X-ray data for mercurated PCNA
makes it the more accurately determined structure, and it is the model
used in all the analysis presented in this paper.

ABSTRACT

Cellular replicative DNA polymerases derive their remarkable
processivity from a ring-shaped protein that encircles DNA and
tethers the polymerase to the chromosome. The crystal structures
of prototypical 'sliding clamps' of prokaryotes (β subunit) and
eukaryotes (PCNA) show ring-shaped proteins for encircling DNA.
Although β is a dimer and PCNA is a trimer, their structures are
nearly superimposable. Even though they are not hexamers, the
sliding clamps have a pseudo 6-fold symmetry resulting from
three globular domains comprising each β monomer and two
domains comprising each PCNA monomer. These domains have the
same chain fold and are nearly identical in three dimensions. The
amino acid sequences of 11 β and 13 PCNA proteins from
different organisms have been aligned and studied to gain
further insight into the relation between the structure and
function of these sliding clamps. Furthermore, a putative
embryonic form of PCNA is the size of β and thus may encircle
DNA as a dimer like the prokaryotic clamps.

INTRODUCTION 

Chromosomal replicases are multiprotein assemblies that
polymerize thousands of nucleotides without dissociating from DNA.
In three well characterized systems (E. coli, eukaryotes and
bacteriophage T4) this remarkable processivity is achieved by a
ring-shaped protein ('sliding clamp') that encircles DNA and
anchors the polymerase to the template (reviewed in 1,2). The
sliding clamp is assembled on DNA by a multiprotein 'clamp
loader' apparatus that recognizes a primer terminus and couples
ATP hydrolysis to assemble the clamp around DNA (3). In
eukaryotes the clamp loader is the 5-subunit RF-C complex (also
known as activator-1) and the sliding clamp is the PCNA protein
which confers processivity to DNA polymerase δ (Pol δ). The
clamp loader of the prokaryotic DNA polymerase III holoenzyme
(Pol III) is the 5-subunit γ complex and the clamp is the β subunit.
In the T4 system the clamp loader is the gene 44/62 protein
complex (g44/62p) and the clamp is the product of gene 45.



The first indication of the processivity factor's circular shape
came from the observation that the β subunit was tightly fastened
to circular DNA but easily dissociated from linear DNA,
suggesting that β slides off the ends of DNA (4). This hypothesis
was reinforced by the observation that the exit of β from linear
DNA was blocked by proteins bound near the ends of the DNA.
The three dimensional structure of the E. coli β subunit showed it
to be a dimer in the shape of a ring with a central cavity large
enough to accommodate duplex DNA (5) (Fig. 1).

The sliding clamp in the T4 system has been observed on DNA
by cryoelectron microscopy (6). The clamps not only appeared to
encircle the DNA, but their appearance also indicated that they
may slide along the DNA. These clamps are presumably gene 45
protein trimers encircling the DNA in a fashion similar to that of
the β dimer. Using a linear template it was shown that gene 45
protein can support processive replication in the absence of the
clamp loader (7) and that the T4 sliding clamp consists solely of
gene 45 protein (8). Interestingly, the gene 45 protein can
assemble on a circular template in the absence of the clamp
loader, provided a macromolecular crowding agent is present (9).

Evidence for the topological binding of PCNA to DNA came
from the observation that PCNA alone can support replication by
Pol δ, in the absence of RF-C complex, only when the DNA is
linear and has a double-stranded end (10). This result suggests
that PCNA can slide along duplex DNA until it reaches the 3'
terminus where it interacts with the polymerase to initiate
processive replication. Supporting evidence for PCNA sliding
along duplex DNA comes from photocrosslinking experiments in
which PCNA can be crosslinked to DNA following assembly
around DNA (11). However, crosslinking of PCNA to DNA is not
observed upon linearization of the duplex (11). Furthermore,
using tritiated human PCNA it has been demonstrated that the
protein can be loaded onto nicked circular DNA, but upon
linearization of the DNA it slides off the ends, similar to the E. coli
β subunit (N. Yao, Z. Kelman and M. O'Donnell, unpublished).
Proof that PCNA and β are structurally similar was obtained upon
solution of the three dimensional structure of the yeast
Saccharomyces cerevisiae PCNA which showed it to be a trimer with
an overall shape very similar to that of E. coli β (12) (Fig. 1).

The sliding clamps (β and PCNA) have a 6-fold appearance, yet
they are not hexamers. In β the 6-fold appearance results from
three globular domains which comprise each monomer, while in
PCNA each monomer contains two domains. In both prokaryotes
and eukaryotes the topological chain fold of each domain is
comprised of two α helices [...] sheets (nine in PCNA). The
domains [...] chain fold and are nearly superimposable in three
dimensions yet share no significant homology (5,12). β sheet
structure is continuous all around the perimeter of the ring and
the α helices that line the central cavity are [...] phosphate
backbone of the DNA.

Figure 1. Ribbon representation of the polypeptide backbones of a trimer of
PCNA (A) and dimer of β (B). (C) is a superimposition of the PCNA and β
rings. Strands of β sheet are shown as flat ribbons and α helices are shown as
spirals. The monomeric units within each ring are distinguished by different
colors.

Here we describe the diverse roles played by [...] clamps in
nucleic acid metabolism. Also, in an a[...] insight into the
relationship of str[...]tic and eukaryotic sliding cla[...]
several β and PCNA proteins from [...] aligned.



Structural features

The three features of the clamp s[...] function are: (i) the
central cavity [...]; (ii) interface [...] loading onto DNA; and
(iii) the surface [...] with other subunits of the polymerase
[...] cellular components. To gain additional insight [...]ant
for function, we aligned the amino acid sequences of 11 full
or partial prokaryotic β subunits (Fig. 2) and 13 [...] (Fig. 3).
These sequences were obtained from the databases by searching
for known DNA sequences of dnaN, the g[...] genes known to
encode for PCNA. To date, only the E. coli β and PCNA of yeast
and human have been shown functionally to act as DNA sliding
clamps.

The β sequences in Figure 2 are split into three parts which
correspond to the three globular domains. The sequences in
Figure 3 are divided into the two domains of PCNA. Glu and Asp
are colored red and Lys and Arg are blue to display more clearly
the distribution of charged residues within the protein. The
alignments were made such that no gaps were introduced between
elements of secondary structure and that even though these
proteins have high net negative charges, the residues aligning
with the α helices of the central cavity have a net positive charge.
These alignments generally follow a third criterion, that positions
corresponding to the hydrophobic amino acid residues (Phe, Leu,
Ile, Met, Val or Ala) in E. coli β and yeast PCNA that are
inaccessible to water, are also occupied by hydrophobic residues.
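The alignment criteria above lend themselves to a simple column-wise check. The sketch below assumes an alignment supplied as equal-length strings with '-' for gaps; the function names and the three toy sequences are hypothetical and are not taken from Figures 2 or 3.

```python
from typing import List, Sequence

# Hydrophobic residues as listed in the text: Phe, Leu, Ile, Met, Val, Ala.
HYDROPHOBIC = set("FLIMVA")


def _check(alignment: Sequence[str]) -> int:
    """Return the alignment length after checking all rows are equal length."""
    if not alignment or len({len(seq) for seq in alignment}) != 1:
        raise ValueError("alignment must be non-empty with equal-length rows")
    return len(alignment[0])


def identical_columns(alignment: Sequence[str]) -> List[int]:
    """Column indices where every sequence has the same non-gap residue."""
    length = _check(alignment)
    first = alignment[0]
    return [i for i in range(length)
            if first[i] != "-" and all(seq[i] == first[i] for seq in alignment)]


def hydrophobic_columns(alignment: Sequence[str]) -> List[int]:
    """Column indices occupied by a hydrophobic residue in every sequence."""
    length = _check(alignment)
    return [i for i in range(length)
            if all(seq[i] in HYDROPHOBIC for seq in alignment)]


# Toy alignment (hypothetical sequences, one-letter code).
toy_alignment = ["MKLV-E", "MRIVAE", "MKLVGE"]
toy_identical = identical_columns(toy_alignment)
toy_hydrophobic = hydrophobic_columns(toy_alignment)
```

In the toy alignment, column 2 is hydrophobic in every sequence without being identical (Leu, Ile, Leu), which is the kind of conservative replacement the third criterion allows.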

The central cavity

In general these clamps are acidic proteins (Tables 1 and 2). The
pI of the PCNAs are slightly lower than the pIs of β subunits and
gene 45 protein has an intermediate pI (pI = 4.8). The distribution
of charges on the ring is asymmetric. The outer surface has a
strong negative electrostatic potential and the inside of the central
cavity has a net positive electrostatic potential. Hence, these
clamps would be repelled by DNA, but after assembly onto DNA,
the ring may even interact favorably with it. A possible function
of the negative outside surface may be to increase the specificity
of assembly on DNA by preventing the ring from associating with
DNA by itself, thus constraining it to enlist the help of the clamp
loader and ATP. Alternatively, the negative surface may help
destabilize local interactions with DNA. A notable exception to
this rule is the β subunit from B. aphidicola with a pI of 9.0.
Hence, providing this β subunit functions like β of E. coli, an
overall acidic charge would appear not to be essential to clamp
function.
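Net charge at pH 7 and the isoelectric point (pI), as tabulated in Tables 1 and 2, can be estimated from sequence with the Henderson-Hasselbalch relation. The sketch below is only an illustration: the pKa values are generic textbook-style assumptions, not those used by the DNASTAR program, so its output would not be expected to reproduce the tabulated figures exactly. The function names and test peptides are hypothetical.

```python
# Assumed pKa values (generic figures, NOT the DNASTAR parameter set).
POSITIVE_PKA = {"K": 10.5, "R": 12.5, "H": 6.0}
NEGATIVE_PKA = {"D": 3.9, "E": 4.1, "C": 8.3, "Y": 10.1}
N_TERM_PKA = 9.0
C_TERM_PKA = 2.0


def net_charge(sequence: str, ph: float = 7.0) -> float:
    """Estimated net charge of a peptide (one-letter code) at a given pH."""
    seq = sequence.upper()
    # Basic groups carry +1 when protonated.
    charge = 1.0 / (1.0 + 10 ** (ph - N_TERM_PKA))
    for residue, pka in POSITIVE_PKA.items():
        charge += seq.count(residue) / (1.0 + 10 ** (ph - pka))
    # Acidic groups carry -1 when deprotonated.
    charge -= 1.0 / (1.0 + 10 ** (C_TERM_PKA - ph))
    for residue, pka in NEGATIVE_PKA.items():
        charge -= seq.count(residue) / (1.0 + 10 ** (pka - ph))
    return charge


def isoelectric_point(sequence: str, tolerance: float = 1e-4) -> float:
    """pH at which the estimated net charge is zero, found by bisection."""
    low, high = 0.0, 14.0
    while high - low > tolerance:
        mid = (low + high) / 2.0
        # Net charge falls monotonically as pH rises.
        if net_charge(sequence, mid) > 0:
            low = mid
        else:
            high = mid
    return (low + high) / 2.0
```

An acidic clamp such as those in Tables 1 and 2 would give a negative `net_charge` at pH 7 and an `isoelectric_point` below 7, whereas a basic protein such as the B. aphidicola β subunit would give the reverse.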

Both β and PCNA have 12 α helices in the central cavity all of
which are perpendicular to the DNA. The α helices are long
enough to span the major and minor grooves and thus act as
molecular crossbars to prevent the clamp from entering the
grooves of DNA during sliding. The α helices have an overall
positive potential as a consequence of eight (PCNA) or 12 (β) Lys
or Arg residues within the ring. However, there is at least one
acidic amino acid (Asp or Glu) in every α helix of β, and in six
out of the 12 α helices in PCNA. These negatively charged
residues may be needed for a balance to prevent tight local
interaction between the clamp and DNA. Although the overall
structure of the rings are similar, the central cavity in β (35 Å) is
larger than the one in PCNA (34 Å) and has a slightly more oval
shape (Fig. 1).

The dimer interface

The interfaces between the protomers of β and PCNA are formed
by β sheet. The β sheet is part of the continuous layer of sheet
structure on the outside of these rings. Three interactions may
contribute to stabilization of the interface: (i) hydrogen bonds
between the β sheets; (ii) a small hydrophobic core; and (iii)
putative ion pairs. In the following sections the amino acids that
contribute to the formation of the interface close to the N-terminus
will be referred to as the 'head interface' and those near the
C-terminus will be referred to as the 'tail interface' (Fig. 4).

Figure 2. Sequence alignment of β subunits from different prokaryotes. The proteins are aligned to show maximum similarity with proteins directly above and below.
Identical residues between neighboring proteins have a vertical dash. Gaps were introduced to maximize homology. The sequences are split into three parts which
correspond to the three globular domains of the E. coli β subunit [...]
the amino acid numbers at the end of each domain of the E. coli [...]
that are conserved in all species are indicated in the bottom row of each domain. The so[...]
Bacillus subtilis [BACSU (25)], Micrococcus luteus [MICLU (26)], Streptomyces coelicolor [STRCO (27)], Pseudomonas putida [PSEPU (28)], Serratia marcescens
[SERMA (29)], Salmonella typhimurium [SALTY (29)], Escherichia coli [ECOLI (30)], Proteus mirabilis [PROMI (31)], Mycoplasma capricolum [MYCCA (32)],
Actinobacillus pleuropneumoniae [ACTPL (33)].



Hydrogen bonds. Hydrogen bonds between the two antiparallel
β sheets may contribute to stabilization of the interface in both β
and PCNA. The difference, however, is that within the PCNA
interface there are eight potential hydrogen bonds between the
sheets compared to only four in the β subunit. Perhaps the greater
entropic force against stable association of a trimer versus a dimer
requires more interactions among the interfaces of the protomers.

Hydrophobic forces. Hydrophobic interactions may also be
important in the strength of the interface. The PCNA interface
contains four pairs of hydrophobic amino acids which form a
hydrophobic core (Fig. 4). In β, however, the interface
hydrophobic core consists of only two pairs of hydrophobic residues
(Fig. 4). This may suggest that hydrophobic interactions play an
important role in the stabilization of a trimeric structure and again
support the notion that more interactions are needed to stabilize
a trimer relative to a dimer.

In β the two hydrophobic pairs are: Phe106 to Leu273 and
Leu108 to Ile272 [(5); Fig. 4]. The positions of the four hydrophobic
residues of the interface have been highly conserved in all the
bacterial species examined (Fig. 2). For example, Phe106 and
Leu108 have been conserved in every species except B. aphidicola
(Phe is changed to Tyr) and M. capricolum (Leu is changed to Ile).
The Ile272 and Leu273 are conserved, though not identical, and are
replaced by either Leu Val or Leu Leu.

In PCNA the four hydrophobic pairs are: Ile78 to Leu154, Ala112
to Val180, Tyr114 to Ile181 and Leu116 to Ile182 (12). The eight
hydrophobic residues in PCNA are not as conserved as in the
prokaryotes. In the head interface the four hydrophobic amino
acids in different species range from being identical to those in
S. cerevisiae or have been replaced by other hydrophobic amino
acids. In the tail interface the picture is more complex. In all
sp[...] hydrophobic residue at position 154 is conserved (either
[...]). However, in several species the amino acids in the [...]
(residues 180-183) [...] amino acids (Thr or Cys) or even a
charged residue (Lys) in [...]

[...] polyhedrosis virus [AcNPV (14)].

Ion pairs. Ionic interactions between amino acids with
opposite charges may also contribute to the stability of the
interface. There are six putative ion pairs in the interface of
E. coli β but only one in yeast PCNA (Fig. 4). Interestingly,
this is the opposite situation than that of the putative
hydrophobic forces in the interface, where there are four pairs
of hydrophobic amino acids in PCNA and only two in β. [...]
ion pairs may play a significant role in the strength of the
[...] most tightly focused region of negative charge; five of
seven residues are glutamates (residues 298-304; Figs 2 and 4).
The [...] tail interfaces of the other bacterial species also
[...] focused region of negative charge. Conversely, the head
interface contains the positively charged partners, but these
residues are scattered over a longer region.

The β sequences from other bacteria each have residues in
positions that correspond to at least three of the ion pairs of
E. coli β and they have additional charged residues in their
interface region which could possibly form other ion pairs. The
[...] putative ion pair conserved among all prok[...] the
Lys74-Glu298 pair. The Arg105-Glu301 pair is re[...] B. [...]
(Glu105-Lys301). Residues [...] buried Arg96-Glu300 of E. coli β
is present in all the species except for B. [...] where both
members of the pair are missing, suggesting intolerance at
burying a lone member of this pair. In PCNA from all 13 species
studied, there is an Asp at position 150 and either an Arg or
Lys at position 110.



[Figure 4 panels: A. β, head interface and tail interface; B. PCNA, head interface and tail interface.]

Figure 4. Residues that [...] interface of E. coli β and of S. cerevisiae PCNA. (A) The interface of the E. coli β and (B) the interface of the yeast PCNA.
Amino acids that form the sheet structure at the dimer interface are underlined. Lines drawn between residues of the two interfaces indicate putative interactions which
stabilize the structure. The N-terminal domain interface (head interface) is shown above the C-terminal domain interface (tail interface).



Table 1. Molecular mass, net charge and calculated pI of β subunits from different species (using the DNASTAR program)

Species          Amino acids    MW (Da)    Net charge (pH = 7)    pI
E. coli          [illegible]    40589.9    [illegible]            ?.982
B. subtilis      [illegible]    42106.6    -14.388                4.819
S. coelicolor    376            39959.4    -18.877                4.487
[illegible]      366            40593.8    -10.279                5.095
P. mirabilis     367            40751.3    -9.286                 5.137
B. aphidicola    365            42057.2    8.207                  9.063

Only β subunits for which the entire sequence is known are listed.

Table 2. Molecular mass, net charge and calculated pI of PCNA from different species (using the DNASTAR program)

Species              Amino acids    MW (Da)    Net charge (pH = 7)    pI
Human                262            28899.5    -16.728                4.457
Murine               261            28787.5    -15.563                4.553
Rat                  261            28751.4    -16.?28                4.457
X. laevis            261            28899.3    -17.896                4.363
D. melanogaster      [illegible]    28832.5    -13.827                4.543
C. roseus            268            29767.6    -17.?31                4.460
D. carota (short)    [illegible]    29226.1    -15.432                4.556
D. carota (long)     365            40105.1    -28.670                4.343
O. sativa            263            29274.2    -15.697                4.503
S. cerevisiae        258            28919.3    -19.837                4.261
S. pombe             260            28971.5    -17.671                4.367
AcNPV                256            28637.5    -9.072                 5.219

Only PCNA for which the entire sequence is known are listed.

Figure 5. Highly conserved amino aci[...] acids (from Fig. 2) in loops are shown in yellow and those located
in the α helices are shown in green.



The evolutionary conservation of charged pairs in these
interfaces suggests they are important to the function of the
clamps. The most obvious role is to stabilize the dimer interface.
However, it is possible that the need to exclude water from the
buried charges could require energy and thus destabilize the
interface, adding to a balance between stability and instability in
this region that may be needed to open and close the interface.
Another possible role for charged residues at the interface derives
from the high degree of symmetry in the ring. Computer
modeling of a head to head β dimer results in a ring which looks
like the head to tail ring, having tight interfaces with an
antiparallel sheet as in the head to tail interface. Therefore,
another possible function of the charged residues may be to
provide specificity to form the head to tail dimer rather than the
head to head dimer.

The interface may be a 'busy' region of protein interaction
as this is the site at which work may be performed by the clamp
loader to open and close the ring. Hence, another possible role
of the charged residues at the interface may be to function in the
recognition of other subunits of the polymerase. Also, if the
polymerase (or another subunit) were to bind the clamp by
spanning the interface it could act as a brace to further stabilize
the interface, thereby increasing the stability of the clamp on
DNA and leading to greater processivity.

Slidhig clamp interaotioiis with other protdns 

The sliding clamps must interact with at least two components
of their corresponding DNA polymerase holoenzyme, the
clamp loader and the DNA polymerase. However, these sliding
clamps also interact with cellular components involved in
other processes (Table 3). In the T4 system, gene 45 protein
interacts with RNA polymerase to activate late gene transcription
(13). Similarly, the PCNA homologue from the baculovirus
AcNPV is important for late gene transcription (14). Beside
its interactions with Pol III holoenzyme, the β subunit also
interacts with DNA polymerase II, an enzyme implicated in
DNA repair. Human PCNA interacts with D-type cyclins (15),
the cell cycle dependent kinase (CDK) inhibitor p21 (16,17)
and the DNA damage induced gene, Gadd45 (18) (Table 3). In
vitro replication studies have shown that the binding of p21 to
PCNA inhibits replication (16,17). Gadd45, like p21, is
induced by p53 upon DNA damage, presumably to block DNA
replication. Consistent with this notion both p21 (16,17) and
Gadd45 (Jerard Hurwitz, pers. comm.) have been shown to
inhibit DNA replication in vitro.



complex (19), suggest that this region in PCNA may be
responsible for interaction with RF-C and/or with Pol δ.



Table 3. Multiple enzymes interact with sliding clamps of prokaryotic and
eukaryotic DNA polymerases

Clamp                 Interacts with:

E. coli β             DNA polymerase II
                      DNA polymerase III
T4 gene 45 protein    T4 DNA polymerase (gene 43 protein)
                      E. coli RNA polymerase (modified by gene 33 protein
                      and gene 55 protein)
Human PCNA            DNA polymerase δ
                      DNA polymerase ε
                      D-type cyclins
                      p21 CDK inhibitor
                      Gadd45 protein



The sequences of the different β subunits in Figure 2 are much
more divergent than the PCNA genes in Figure 3, due to the
greater evolutionary distance between the prokaryotic organisms
relative to the eukaryotes. Hence, identical residues in the β
subunits from all species may be expected to be functionally
important. Among the β subunits analyzed only 15 residues are
identical in all species (Fig. 2). These conserved amino acids are
either in loops or in the α helices lining the central cavity (Fig. 5).
Interestingly, the conserved amino acids on the loops are all on
only one face of the ring. The conserved residues are good
candidates for interactions with other proteins. Consistent with
this, the C-termini of β which protrude from this face have been
shown to be a touch point for interaction of β with both the clamp
loader and the DNA polymerase, consistent with the recognition
of β by both the polymerase and the clamp loader (V. Naktinis and
M. O'Donnell, unpublished observation).

The C-terminal amino acids of the E. coli β subunit are
important in the binding of both the γ complex and the polymerase
(V. Naktinis and M. O'Donnell, unpublished observation).
Examination of the C-terminus of PCNA shows a notable, high
negative charge. In all species there are 3-6 Glu or Asp residues
adjacent to the C-terminus (Fig. 3). The three dimensional
structure of the yeast PCNA shows that the three acidic amino
acids form a 'hook' with the side chains pointing to the solvent
(12). This highly acidic region is unique to PCNA and is not
present in β or gene 45 protein. It is tempting to speculate that the
acidic C-terminus of PCNA, lacking in β, may be involved in
interaction with cellular regulators.

In the region preceding the acidic C-terminus of PCNA there
are several sequence similarities to β. In all PCNAs, there is a
conserved Lys before the Glu/Asp stretch and in all β subunits
there is a positively charged amino acid in the C-terminus (Fig.
2). Additionally, both PCNA and β have a hydrophobic residue
(Ile or Leu) following the positively charged amino acid and both
have one Pro within the terminal four amino acids. The amino
acid sequence similarities at the C-terminal region of both β and
PCNA, together with the observation that the subunits of the
E. coli γ complex share sequence similarities with the RF-C



Dimer and trimer sliding clamps are found in both
prokaryotes and eukaryotes

All of the prokaryotic dnaN genes examined here encode proteins
of approximately the same size (length of 365-378 residues;
Table 1). Hence, they presumably form dimers like E. coli β. The
13 PCNA genes encode proteins that are ~2/3 the size of β
(lengths of 257-26? residues; Table 2) and presumably form
trimers like yeast PCNA.

Why did eukaryotes evolve [illegible]? The chain
topology of the domains suggests that [illegible] and β evolved
from a single domain that [illegible] hexamerising or
[illegible] fused. The three domains [illegible] two
fusions; the two domains of PCNA [illegible] only one
fusion. Hence, it is possible that eukaryotes (and phage T4) have
yet to undergo the second fusion [illegible] three domain
monomer like prokaryotes. Alternatively, [illegible] form may
have evolved from the prokaryote [illegible]
of one domain.

A trimeric ring composed of three protomers may not be as
stable as a ring composed of two protomers, having a greater
probability for protomer disassembly and a lower probability for
three monomers to associate relative to only two. [illegible]
that a less stable ring is used to an advantage during [illegible]
of the lagging strand. The lagging strand is [illegible]
of fragments. Although the clamps exist in excess over the
polymerase inside the cell, they are [illegible]
for each Okazaki fragment. Hence, [illegible] must be
recycled. The β subunit is very stable [illegible] and a clamp
unloading activity was identified to maintain a pool of available
clamps within the cell (20). A less stable [illegible] DNA may
dissociate more easily from [illegible] polymerase
abandons it and thus a specific unloading [illegible] not be
needed to maintain an intra[illegible]
7 Itfi^pearstf^ 

:iityp?!^f ririgs, dK)se TOn5)bsed of i^^^ of 
v - a idittX'fo case bf k^doU;^^^^ that a 

ii^bsxiex vd:aoii of P (called P*) is^indiK^/'upo^^ of cells 

^j6JJV ligji. p* is ejqjressed firom^ito^^ widiin the 

V dnaN^^Ttis p^js con^ffiked^pf 273 of normal 

C CCb^aSq^^ 6f f P'^ . dio on 

^^^g^ptt^^ stinm- 
= 1^ 

l}:'-iA^^py0^ dinwiic ^ii^ in:c^^ in an 
V^i^olwtyoD^ 

? ,-^pfeKA g^n^ were isolated iE^m D.c^^ a PCNA 

^ V bf jt^ and the 

'bdikei^^^ akugerpitpteinQf 3^^ kDa). This 

; = is^^t^^ that the 

;;:ei^^ Further; 
: : c iipffiing pqgpMiesis of the amphibiaii, Xehopiis l&vis^ the presence 

; of a 42f kDa pnotein that cross-reacts witib iiritiL-P^ antibodies 
: ^^odiielates -with production of two differentiy sized transcripts 

it appears that eukaryotes and prokaryotes alike may
utilize rings made of dimers and trimers. Speculation as to why
the chromosomal replicase of prokaryotes use the dimer form of
the clamp and eukaryotes use the trimer has been treated (23), but
experimentally derived evidence pertaining to this question
remains for further study.
The proliferating-cell nuclear antigen
WM-idiaitiien^liMfi^ 

^^^^^a^tfim 40 origto of DNA 

— AanStttxobpe^ sfimqlate processive 
W^n TiS^^V^SP^^^ * ^ ■ primed dn^e- 
;m^Vt0iMfA^ such can be categorized as 

^SlmtGed RF* ami PCN have donoiBtrated 
•^^Ip^ c«ii0ite!y analogous to tvaOHios of 
^^T4DNAp(^mc^ accessory proteins. Aprun- 
te^p«dflc DNA binding activity^ and a DNA- 
ll^?aca»li^ co^ittificd with the muUisnbunit 
K*;4»d are dmOar to the ftanctions of the phage T4 
'i2i^Sln^oinpJai FiirUiemore, PCNA stimnlatwl 
^^TO^ «ea«^ ind ji^ «>««f®^^ analogjms to the 
^'^»~i^t^^^ the ATPase ftinc- 

tptasW/Si'vri^^ Inaeed, some primary 

^Sifet^n PCfjA P'«!!f7* 
^''^djuri bfe^l^tecte^ these results dmonstrate a 

' ia«tiof;tiie#^^ 



i- /5Kv,^ V^^ is.requiT^ ja fonn an 

jie a< 2)jlittleiw{a known aboiat either the 

1 «<;fe^^<a^msiKhte^Wt$^NA repUra^^ in mammalian, 
SS^^S^DNA^^ymerase « (pol «) was long 
^wM^MmW^^ eukaryotes. the 

^^^^j^itudfes^the SY40 sys^ 

^*^^^^«lS?Sj^aiiinencp6lc6^^ 

I « hffl.S'?M^ac<^sci#|ir^^ is responsible for copr- 
mm^m^^V[^i^,^^ and laeanR strands a( 



;vS^^6n fork (3* 8)>Pq1s8 containsa 3'^^ 
sSi^Sia^^can^synthesl2ft DNA processively oit 
SlSSdn DNA in the presence of 

Further studies demonstrated [illegible]
DNA processively [illegible] efficiently on a primed M13
[illegible] DNA [illegible] in the presence of three
[illegible] factors, replication factor A (RF-A), PCNA, and
[illegible]. RF-A is a multisubunit ssDNA
[illegible]
of DNA replication in vitro and functions as a stimulatory
[illegible] and pol δ (8, 11-15). This latter function may

The publication costs of this article were defrayed in part by page charge
payment. This article must therefore be hereby marked "advertisement" in
accordance with 18 U.S.C. §1734 solely to indicate this fact.



be analogous to the activities of a number of viral, phage, and
bacterial ssDNA binding proteins that stimulate their homologous
pols (2, 3, 16). RF-C is required, along with PCNA, only
for the elongation of SV40 DNA replication and [illegible]
to be required for the coordinated synthesis of leading and
lagging strands (5, 17). Several properties of RF-C, for example,
its function as a pol accessory protein and [illegible]
affinity for ssDNA bound to cellulose (17), [illegible]
it may be involved in template-primer recognition or [illegible] a
molecular clamp holding the pol onto the template DNA.
Analogous functions have been suggested for prokaryotic pol
accessory proteins (1, 2, 16).

Studies on Escherichia coli and its phages have elucidated
the general mechanisms for the initiation and elongation of
DNA replication. Particularly relevant to this discussion are
the studies on the bacteriophage T4 DNA replication proteins
required at the replication fork (for review, see Cha and
Alberts (18)). The phage pol, encoded by gene 43, [illegible]
the leading and lagging strands at a replication fork, forming
a dimeric pol complex. Pol accessory proteins [illegible]
genes 44 and 62 function as a DNA-dependent ATPase and
primer-recognition protein [illegible].
ATPase activity is stimulated by another protein, [illegible]
gene 45, and together the gene 44/62 and gene 45 proteins
complex and cooperate [illegible] stimulate [illegible]
forming a pol holoenzyme. In addition, [illegible]
and the "primosome" proteins encoded by genes 41 and 61
(DNA helicase and primase) also function at the replication
fork to unwind the parental [illegible]
Okazaki fragment synthesis on the lagging strand, and augment
pol function. We have further characterized two human
cell replication factors, RF-C and PCNA, and demonstrate
that they are strikingly [illegible] encoded by the
phage T4 genes 44/62 and 45, respectively.

MATERIALS AND METHODS

Replication Factors RF-A, PCNA, and RF-C, and Pol δ.
RF-A was purified from a human 293 cell cytoplasmic extract
as described (15) and a ssDNA-cellulose fraction ([illegible]0 μg/ml)
was used. Two sources of PCNA were used in this study.
PCNA (200 μg/ml) purified from human 293 cells by a
published procedure (4) was used in Figs. 2 and 4. PCNA that
had the same specific activity as PCNA from human cells was
produced in E. coli harboring a plasmid carrying the human
PCNA cDNA sequence (19) under the control of bacteriophage
T7 promoter. The E. coli-produced PCNA was purified

Abbreviations: pol, DNA polymerase; pol α, DNA polymerase α; pol
δ, DNA polymerase δ; pol III, DNA polymerase III; RF-A, replication
factor A; RF-C, replication factor C; PCNA, proliferating-cell
nuclear antigen; SV40, simian virus 40; ATP[S], adenosine
5'-[γ-thio]triphosphate; ssDNA, single-stranded DNA.
to homogeneity by four steps (K. Fien and B.S., unpublished
results) and used in Fig. 3. RF-C was purified from human 293
cell nuclear extracts through four steps (ssDNA-cellulose
fraction; 60 or 40 μg of RF-C per ml) or five steps (glycerol
gradient fraction; 8 μg of RF-C per ml) as published (17). Pol
δ was purified from calf thymus (90 g) by five steps as
described (8, 20) and the ssDNA-cellulose fraction with a
specific activity of 6[illegible].8 x 10[illegible] units/mg was used. One unit of
pol activity was defined as the incorporation of 1 nmol of
dTMP at 37°C in 1 hr under conditions described in ref. 8.

DNA Synthesis Reaction on a Primed M13 ssDNA with Pol
δ. The reaction mixture (25 μl) containing 30 mM Hepes (pH
7.8), 30 mM NaCl, 7 mM MgCl2, 0.5 mM dithiothreitol,
bovine serum albumin (0.1 mg/ml), all four dNTPs (each at
0.05 mM) with [α-32P]dATP (2000 cpm/pmol), 100 ng of M13
ssDNA (M13mp18) primed with a 3-fold molar excess of a
unique 17-base sequencing primer (primer 1211 from New
England Biolabs), 200 ng of PCNA, 1 μg of RF-A and
0.27-0.54 unit of pol δ was incubated at 37°C for 30 min, and
acid-insoluble radioactivity was determined.
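
The specific radioactivity quoted for the labeled dATP (2000 cpm/pmol) allows acid-insoluble counts to be converted to picomoles of labeled dAMP incorporated, and the 30-min incubation can be scaled to an hourly rate. The following is a minimal sketch of that arithmetic only. The function names and the example count are hypothetical and are not taken from this study, and the result tracks labeled dAMP rather than the dTMP-based unit defined in ref. 8.

```python
# Illustrative conversion from acid-insoluble counts to incorporated nucleotide.
# The specific radioactivity (2000 cpm/pmol) comes from the reaction conditions
# described above; function names and the sample input are hypothetical.

SPECIFIC_ACTIVITY_CPM_PER_PMOL = 2000.0


def cpm_to_pmol(cpm, specific_activity=SPECIFIC_ACTIVITY_CPM_PER_PMOL):
    """Convert acid-insoluble cpm to pmol of labeled dAMP incorporated."""
    if specific_activity <= 0:
        raise ValueError("specific activity must be positive")
    return cpm / specific_activity


def incorporation_rate_nmol_per_hr(pmol, minutes):
    """Scale an incorporation measured over `minutes` to nmol per hour."""
    if minutes <= 0:
        raise ValueError("incubation time must be positive")
    return (pmol / 1000.0) * (60.0 / minutes)


if __name__ == "__main__":
    example_cpm = 40000.0  # hypothetical count, not a measured value
    pmol = cpm_to_pmol(example_cpm)
    rate = incorporation_rate_nmol_per_hr(pmol, 30.0)
    print(f"{pmol:.1f} pmol dAMP incorporated; {rate:.3f} nmol/hr")
```

Scaling linearly to one hour assumes that incorporation is linear over the incubation, which the text does not state.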

ATPase Assay. The assay for RF-C ATPase was essentially
the same as described (21). The reaction mixture (25 μl)
containing 50 mM Tris·HCl (pH 7.[illegible]), 0.1 M NaCl, 1 mM
dithiothreitol, 2 mM MgCl2, bovine serum albumin (100
μg/ml), 0.1 mM [γ-32P]ATP, and the indicated DNAs was
incubated at 37°C for 30 min.

RESULTS 

RF-C Binds Specifically to a Primer-Template DNA. RF-C
and PCNA cooperate [illegible] stimulate the putative
leading-strand pol, pol δ (8). As noted above, this suggested
that they may be similar to prokaryotic pol accessory proteins.
Since RF-C bound to DNA, the DNA binding activity
of RF-C was studied by nitrocellulose filter binding assays
with 32P-labeled poly(dA) [a (template) ssDNA] or poly(dA)-
oligo(dT) (a primer-template DNA). As shown in Fig. 1A,
RF-C bound to poly(dA)-oligo(dT) but not to poly(dA). The
binding specificity of RF-C was demonstrated by testing the
ability of various unlabeled DNAs to act as competitors for
RF-C binding to [illegible] (Fig. 1B).
Under these conditions, [illegible] to ssDNA carrying
primers but not to ssDNA, double-stranded DNAs or to an
RNA-DNA template-primer [poly(A)-oligo(dT)]. This
primer-template-specific DNA binding activity cosedimented in a
glycerol gradient [illegible] two other activities of RF-C; the ability
to stimulate SV40 DNA [illegible] and the stimulation
of pol δ on a primed M13 ssDNA in the presence of RF-A
and PCNA (Fig. 2). As reported (17), the latter two activities
cosedimented with a [illegible] containing polypeptides
with apparent molecular masses of 140, 41 and 37
kDa. This analysis suggests [illegible] primer-template binding
activity is due to [illegible] not a contaminant in the purified
preparation.

RF-C Is an ATPase Stimulated by DNA
and PCNA. Several [illegible] accessory
proteins have been reported (2, 23-28) and, among these, the
well-characterized bacteriophage T4 gene 44/62 protein complex
has a specific primer-template DNA binding activity (23,
25). Moreover, this protein complex also contains a DNA-
dependent ATPase activity that is stimulated by the presence
of 3' ends on the DNA (27, 28). Thus, we tested the possibility
that RF-C might have a similar ATPase activity. Fractions
from the glycerol gradient that exhibited the specific primer-
template binding activity also contained an ATPase activity
when assayed in the presence of poly(dA)-oligo(dT) (Fig. 2B),
indicating that RF-C also functions as an ATPase. As shown
in Fig. 3A, highly purified RF-C has a low level of DNA-
independent ATPase activity, but this activity is stimulated
severalfold by either ssDNAs or double-stranded DNAs




^i^FiGilj : pNA 

' poly(dA) (5 ir^cpcp^ pr '^P- 

61ig<Kcrni2^:(l:4^^ ng per reactiqal 

4ncubated,\vitK;^y^^ 

60 fi^ndj[:^i^^n^^^^ 

^ biif[ef^B§^ii^ 5 aissay but witK4P liMl 

Thc^maturf^^ through 

:ridiif^aii|y^^ 

;q>r^!C^|3q|^^(^>i^^ 

tc^ca<A\Ali^/5oC^ 
. l»iy(dA)«ii iig<)f RF^^ shw^Sife^ 

w^^ncttbic^^ 
: iiuctwffde) dfiom^ 

hitn^lluic^^ 

CQrtcspc^dt^te;^^ 
: pi|iy(<L^^^ (6i^y(A35!. 

. [j>ply(d^^ 

tiyclj^ifciitQKi^^^^ 
the:ssI3NXi|^^ 
^dc^dtpi^^ _ _ 

^ piiig«^!?l^fii^^ 
depchdenoe!^phr!p 



activity of the phage T4 genes 44/62 [illegible].
In addition to the stimulatory effect of [illegible]
gene 44/62 protein [illegible]
by another accessory protein, the gene 45
protein (26-28). Since the phage T4 [illegible] and
PCNA do not bind directly to DNA but [illegible]
with their respective pols (2, 9), we predicted that [illegible]
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^ti^A^S^rtTaseicti^i^^^^ RF-C to* glycerol sradicnt. 

^-I^Sted^Sto iS^iiil glycerol gra&pnt m buffer 

I^SSi^Sl M NaCSl is aesSibed a7). After centnfu^mi in a 
|3t\^S^o^ ar49.<toO rpm:it 4-G.for 24 J>r.>38 ft»cli«>ns were 

^Siw to aii^iel gnuliwt were catidascm "^^^^Jl 
pSS^hvd&aie a.4 S.fl«ti^^ 16. bovine seromBlbumin 
I'SmS Sri 24;'ovalbuimta (3.7 S) at^ 26. and cy- 

wasWparated in an NaDodSO^/polyacrylanBde gel 02;^ 
^iiidDiSeins wwe staiiicd wthiavCT. Molceular mass maiiers 0^) 
iSJSSr^riglU. {B) Stimulaticmof pd ^P|^*i^ 
^"i«n^ in the nresence of RF-A and PCNA. and DNA-dependcnt 
^i^SS^SiwS"nSwith2&HKHfcly(dA)^o^^^^ 

<^d[Scd>sing.4:^ of «ch I^NAMndepen- 
^Sf-ATPasc activity vras also measured in firactwns 14 and 16 and 
.J^S^KSu^obtainedwltlffiNApresentO^ 
|''SrcSlementaao;( assay for RF^ in.SV4P DNA reph«^on «. 
^Xlnd DNA binding to a primer-template. The Rf-C ~"nP'«- 
iJ inentation was done m a 50-jil reaction raixtme contaimng 350 Mg of 
JS. I*T7>g of SV40 tumor antigen. 200 ng of topoisomer- 
would be analogous to the phage T4 gene 45 protein and
would stimulate the RF-C DNA-dependent ATPase. This
was tested by addition of various amounts of purified PCNA
into the RF-C ATPase reaction in the presence of primer-
template DNA. The DNA-dependent ATPase activity of
RF-C was stimulated up to 4-fold in the presence of saturating
amounts of PCNA (Fig. 3B). In the absence of RF-C, highly
purified PCNA revealed a background level of ATPase, so
that it is possible that this minor ATPase observed in the
PCNA preparation is greatly stimulated by interaction with
RF-C. But the PCNA-stimulated ATPase activity displayed
the same specificity for the various DNAs as the RF-C
ATPase activity shown in Fig. 3A (data not shown), suggesting
that most of the ATPase activity observed is intrinsic to
the RF-C polypeptides. Therefore, this result demonstrates
an interaction between RF-C and PCNA similar to that
observed between the phage T4 gene 44/62 complex and the
[illegible] In the phage T4 DNA replication system, [illegible]
of ATP or dATP by the accessory proteins is required for the
formation of a stable initiation complex [illegible]
proteins and pol bound to the 3'-end of the DNA primer ([illegible]
23, 2[illegible]). If RF-C and PCNA are functionally analogous to the
phage T4 gene 44/62 protein complex and gene 45 protein,
respectively, stimulation of pol δ activity on primed M13
ssDNA by RF-C and PCNA should require hydrolysis of
ATP. As reported (8), DNA synthesis [illegible]
ssDNA was stimulated about [illegible]-fold by RF-C and PCNA
compared to incorporation by pol δ alone. If 1 mM ATP was
added to this reaction, the DNA synthesis was further
stimulated about [illegible]-fold (Fig. 4). [illegible] ATP-independent
processive DNA synthesis by pol δ in [illegible]
PCNA may have been due to the utilization of dATP instead
of ATP, as has been reported in [illegible].
[illegible] the three replication proteins, RF-A, RF-C, and PCNA, were
added to reactions containing pol δ and primed M13 ssDNA,
addition of ATP had no effect (Fig. 4). This is probably due
to maximal stimulation of pol δ by [illegible] accessory
proteins and dATP. We have demonstrated that under these
conditions, DNA synthesis by pol δ [illegible]
[illegible] ATP (or dATP) hydrolysis [illegible] this
stimulation of pol δ because [illegible] of 1 mM adenosine
5'-[γ-thio]triphosphate (ATP[S]), a non-hydrolyzable
analogue of ATP, to the reaction abolished the
stimulation of processive DNA synthesis by the [illegible]
accessory proteins (Fig. 4). [illegible] did not inhibit the small
amount of synthesis obtained with pol δ plus PCNA, suggesting
that the effect of ATP[S] was RF-C-dependent and
not the result of competitive inhibition of dAMP incorporation.
Therefore, these results demonstrated that processive
DNA synthesis by pol δ on primed ssDNA requires [illegible]
been reported in phage T4 system.



DISCUSSION

These biochemical analyses clearly indicated that RF-C and
PCNA are functionally analogous to the phage T4 gene 44/62
protein complex and the gene 45 protein, respectively. It was,
therefore, of interest to determine whether there were any

ase I, 90 ng of topoisomerase II, 300 ng of pSVO10, and 5 μl of each
fraction as described (15). After the incubation at 37°C for 1 hr,
acid-insoluble cpm were measured. The primer-template DNA binding
activity was measured as described in the legend to Fig. 1 with
3 μl of each fraction, but in the presence of 50 mM NaCl. DNA
binding activity using poly(dA) was also tested in parallel with the
same fractions; however, no DNA binding was detected (data not
shown).
structural similarities between [illegible] only a comparison
between human PCNA (19) and [illegible] gene 45 protein (30)
is possible at present. A computer search of their primary
amino acid sequence revealed [illegible] regional similarities
(31-50% similarity) that could be aligned in a linear arrangement,
albeit with several gaps [illegible] each amino acid sequence
(Fig. 5). The fact that [illegible] span the length of the
amino acid sequences and that these proteins are functionally
equivalent suggests [illegible] are evolutionarily related
and that the identical residues are important for function
in both proteins.

The [illegible] replicative [illegible] (pol III) also associates
with multiple subunits and requires hydrolysis of ATP for
processive DNA synthesis [illegible]. In this case, the γδ
complex and the [illegible] III holoenzyme seem to
Fig. 3. ATPase activity of RF-C. (A)
Titration of RF-C without DNA (a) or with
26 μM poly(dA) (□), 26 μM adenovirus DNA
digested with HincII (○), or 26 μM
poly(dA)-oligo(dT) (1:4 molar ratio) (●).
(Note that all the DNA concentrations are
expressed as μM of nucleotide.) A 25-μl
reaction mixture containing the indicated
amounts of RF-C was incubated with DNA
and [γ-32P]ATP and the released Pi from ATP
was determined. (B) Addition of PCNA to
RF-C ATPase assay. Increasing amounts of
PCNA were incubated in the reaction mixture
containing 26 μM poly(dA)-oligo(dT)
with (●) and without (○) 24 ng of [illegible]
the amount of released Pi was measured.
Fig. 4. Time course of DNA synthesis with pol δ. The reaction
mixture (25 μl) contained [illegible] unit of pol δ (8), 200 ng of PCNA, 40
[illegible] of RF-C, and, where indicated, 650 ng of RF-A, 1 mM ATP, or 1
mM ATP[S]. Samples (3 μl) were withdrawn at the indicated times
and acid-insoluble radioactivity was measured. Values are shown as
the incorporation of dAMP per [illegible] of the reaction mixture.
Components used were PCNA and RF-C (A); PCNA, RF-C, and ATP
(U); PCNA, RF-C, and RF-A (a); PCNA, RF-C, RF-A, and ATP (●);
PCNA, RF-C, RF-A, and ATP[S] (p). Experiments with RF-C and
PCNA (with and without ATP) were done separately from other
experiments. Experiments with RF-C, PCNA, and ATP[S] or pol δ
alone were also done in parallel and yielded almost the same results
shown by the open circles (data not shown).



function similarly to RF-C and PCNA, respectively. [illegible]
subunit was proposed [illegible] similar to PCNA,
but the amino acid sequence deduced from its coding sequence,
the dnaN gene, revealed little significant similarity
with human PCNA (data not shown). Thus, [illegible]
be a particular functional and structural relationship between
the phage T4 and mammalian cell replication components.
Indeed, the T4 gene 45-encoded protein stimulated the RF-C
ATPase activity (unpublished results).

As mentioned above, twd pglsV^^a and X are irivolVelilf^ 
eukaryotic DNA replt<iti6ti; ah^ iri tifje yeast ^accAoriS 
cerevisiae pols l and Illcprttspiorld pols a and jS; 
tiveiy (6, 7, 33), Genes c^ingforhiii^ ppli(;r an^^ 
I and III have been i^lale^C^ 
sequences demonstrated &e .{present 6^^^^^ 
gions that rnaiy cdrresFKih<|:^^^ 
34). It is strikingthat bacten6j[ifi^e t4 pol alsbl^^ 
class of pols by sequctfce\^iriuiaiity^(^^ 
coli po\ III has almost rib: simU^ 
may have evolved further fii^ the 
likely that during development offe^ 
gene of tiiephage T4 type diipU^ di vergcSl j^^ 
pol a and pol5 types. Thi^'itujiy 
ac<^ssory protein^^iiiia^i t^^ 
evolution; howey^; fiijeifil^ 
complex functions with pel ;5> pi^bafi^ci3€^^ 
be the processiye le^ing-strtih 
a, which does H6t afipcartb iiitei^^^^^^ 
interact witii RFklknd^ 

dent of the accessory ph)tcm^^ 

kept tiic interactibii With - my^A^iiM^Mt 

cause it fiinctiohs as^^i^^^^ 
(36) have demohstirated thatt^^^^ 
by gene 61, is orily reqbiri^ Ifbria^ 
tioh and not for leadki^lsiii^ 

Fractionation ;pf factors i^ir^for^^ 
. tion i/i vitrd has beien ,a^^i^^ 
replication compbnehtsfr^ 
were identified ^as axbei^rjSip 
undoubtedly involved direcUy jn c«llii^ 
however, other repjication cbmponetits reo^il^^^ 
and characterized. ' • - ... 

A remarkable sinrularity bietWeeh' i^ 
eukaryotes has been iiqted; The: gtpuM m 
some phage T4 raRNAs are related to thc'm'u^^ 
present in some eukaryote RNAs (37)/ $l^b 
Fig. 5. Comparison of human PCNA
(HPCNA) and phage T4 gene 45 protein
(GENE45). (A) Amino acid sequences were
obtained from refs. 19 and 30. Homologies between
human PCNA and gene 45 protein were
detected using the ALIGN program (31) and manual
inspection. Identical amino acids are boxed
and conserved amino acids are indicated with a
dot. (B) Regions of homology between the two
proteins are indicated and the numbers indicate
percent identity and percent conservation (in
parentheses) between them with respect to the
human PCNA sequence. The single-letter amino
acid code is used.
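
The percent identity and percent conservation figures mentioned in the legend can be illustrated with a short calculation over a pairwise alignment. The sketch below is an assumption-laden illustration and not the ALIGN program or the criteria used for the figure. The conservative-substitution groups are a common textbook grouping chosen here for demonstration, the function name is hypothetical, and the example sequences are invented.

```python
# Percent identity and percent conservation between two aligned sequences,
# expressed relative to the ungapped length of the reference sequence.
# The substitution groups below are an illustrative assumption; they are not
# the conservation criteria used in the figure.

CONSERVATIVE_GROUPS = [
    set("ILVM"), set("FYW"), set("KRH"), set("DE"),
    set("NQ"), set("ST"), set("AG"),
]


def _same_group(a, b):
    """True if residues a and b fall in the same conservative group."""
    return any(a in group and b in group for group in CONSERVATIVE_GROUPS)


def percent_identity_and_conservation(ref_aligned, other_aligned, gap="-"):
    """Return (percent identity, percent conservation) for two aligned strings.

    Conservation counts identical residues plus conservative substitutions.
    Columns containing a gap in either sequence are skipped.
    """
    if len(ref_aligned) != len(other_aligned):
        raise ValueError("aligned sequences must have equal length")
    ref_len = sum(1 for a in ref_aligned if a != gap)
    if ref_len == 0:
        raise ValueError("reference sequence is empty")
    identical = 0
    conserved = 0
    for a, b in zip(ref_aligned, other_aligned):
        if a == gap or b == gap:
            continue
        if a == b:
            identical += 1
            conserved += 1
        elif _same_group(a, b):
            conserved += 1
    return 100.0 * identical / ref_len, 100.0 * conserved / ref_len


if __name__ == "__main__":
    # Invented toy alignment, not real PCNA or gene 45 sequence.
    identity, conservation = percent_identity_and_conservation("MKV-LE", "MRVALD")
    print(f"identity {identity:.1f}%, conservation {conservation:.1f}%")
```

Dividing by the reference length mirrors the legend's phrase "with respect to the human PCNA sequence"; other conventions divide by alignment length instead.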
The genes encoding the δ and δ' subunits of the 10-
subunit Escherichia coli replicase, DNA polymerase III
holoenzyme, have been identified and sequenced. The
holA gene encoding δ is located downstream of rlpB at
15.2 min and predicts a 38.7-kDa protein. The holB
gene encoding δ' is located at 24.3 min and predicts a
36.9-kDa protein. Hence the δ and δ' subunits are unrelated
proteins encoded by separate genes. The genes
have been used to express and purify δ and δ' in quantity.
The predicted amino acid sequence of δ' is homologous
to the sequences of the τ and γ subunits revealing
a large amount of structural redundancy within the
holoenzyme.



DNA polymerase III holoenzyme, the replicase of Escherichia
coli, is composed of 10 subunits (α, ε, θ, τ, γ, δ, δ', χ, ψ,
β) (1). The holoenzyme is fast (750 nucleotides/s) and highly
processive (>100 kb) in synthesis (1-4). This speed is approximately
the observed rate of replication fork movement in E.
coli (5), and the high processivity is consistent with the single
origin from which two opposed replication forks travel around
the 4-Mb circular chromosome. The three-subunit polIII core
subassembly (α, DNA polymerase; ε, proofreading 3'-5' exonuclease;
and θ) is slow (20 nucleotides/s) and only processive
for approximately 11 nucleotides (2). Speed and processivity
are conferred onto the core polymerase by the γ complex (γ,
δ, δ', χ, and ψ) and β accessory proteins (6-8). Biochemical
and x-ray studies show that β is shaped like a ring to encircle
and slide along DNA (9-11). The γ complex couples ATP to
assemble the β ring around DNA (12). The β sliding DNA
clamp also binds directly to α, thereby tethering the core
polymerase down to DNA for highly processive synthesis (9).
We have a relatively firm understanding of the function of
the α (dnaE), ε (dnaQ), and β (dnaN) subunits because they
are available in quantity through use of their genes. The only
other known holoenzyme subunit gene is dnaX, which encodes
both τ (71 kDa) and γ (47 kDa); γ is a truncated version of τ
due to a translational frameshift (13-15). The carboxyl-terminal
sequence of τ provides it with an ability to bind together
two molecules of polIII core, presumably for simultaneous
synthesis of the leading and lagging strands (16, 17). By itself,
γ cannot assemble β onto DNA; for this the δ subunit is also
required, and the δ', χ, and ψ subunits stimulate the reaction
(12, 18). To gain further insight into the functions of δ, δ', χ,
and ψ, their genes must be identified for genetic studies, and
the proteins must be produced in quantity for biochemical
studies.
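
The rates quoted in the introduction above (750 nucleotides/s for the holoenzyme, 20 nucleotides/s for the core, and a 4-Mb chromosome copied by two opposed forks) imply very different replication times. The following back-of-envelope sketch makes that comparison explicit. It assumes uninterrupted synthesis at a constant rate with each fork copying half the chromosome, and it ignores processivity limits; the function name is hypothetical.

```python
# Back-of-envelope replication times from the rates quoted in the text.
# Assumes constant-rate, uninterrupted synthesis and two forks that each
# copy half of the chromosome (a simplification, not a model from the paper).

CHROMOSOME_BP = 4_000_000     # 4-Mb circular chromosome
FORKS = 2                     # two opposed forks from a single origin
HOLOENZYME_NT_PER_S = 750     # holoenzyme rate
CORE_NT_PER_S = 20            # three-subunit core rate


def replication_time_s(genome_bp, rate_nt_per_s, forks=FORKS):
    """Seconds needed when `forks` forks share the genome equally."""
    if rate_nt_per_s <= 0 or forks <= 0:
        raise ValueError("rate and fork count must be positive")
    return genome_bp / forks / rate_nt_per_s


if __name__ == "__main__":
    holo = replication_time_s(CHROMOSOME_BP, HOLOENZYME_NT_PER_S)
    core = replication_time_s(CHROMOSOME_BP, CORE_NT_PER_S)
    print(f"holoenzyme: {holo / 60:.1f} min")
    print(f"core alone: {core / 3600:.1f} hr")
```

Under these assumptions the holoenzyme rate gives roughly 44 min per chromosome, whereas the core rate would give about 28 hr even before its low processivity is taken into account.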

In this series of reports the genes encoding δ, δ', χ, and ψ
of the γ complex are identified and also the gene encoding θ
of polIII core, the last remaining gene of the holoenzyme. We
have named these genes hol, for holoenzyme (suggested by
Dr. Kenneth K. Marians, Sloan-Kettering), and the letters
A-E in order of descending molecular mass (δ, δ', χ, ψ, and θ,
respectively). We have sequenced the genes, cloned them into
expression vectors, purified the subunits, and have initiated
studies of their structure and function.

We begin the series with the identification of holA and holB
and purification of δ and δ' in quantity. The amino acid
sequence of δ' is homologous to the sequence of the γ and τ
subunits, revealing a much greater structural redundancy in
this multienzyme complex than previously recognized in only
the relationship of γ to τ. In the second report, the physical
and functional interactions of δ and δ' are characterized alone
and with other holoenzyme subunits (19).

EXPERIMENTAL PROCEDURES 

E. coli Strains, Bacteriophage, and DNAs

HB101 (pNT203, pSK100) (20) used to purify sufficient δ and δ'
for microsequencing was the gift of Dr. Arthur Kornberg (Stanford
University). Strain BL21(DE3) pLysS and pET3c (21) were gifts of
Dr. F. William Studier (Brookhaven National Laboratory). λ15D7
(169) and λE9G1 (236) were gifts of Dr. Yuji Kohara (National
Institute of Genetics, Japan). pUC18 and M13mp18 double-strand
DNA were from Bethesda Research Laboratories. M13mp18 ssDNA
was purified as described (9). All cell growth was in LB medium using
ampicillin at 100 μg/ml and chloramphenicol at 30 μg/ml where
needed. λ phage was prepared as described (22). Buffer A is 20 mM
Tris-HCl (pH 7.5), 20% glycerol, 0.5 mM EDTA, 2 mM DTT. Buffer
B is 30 mM Hepes-NaOH (pH 7.2), 10% glycerol, 0.5 mM EDTA, 2
mM DTT.

Microsequencing 

The δ and δ' subunits were purified through the ATP-agarose
column step (18) from 1.3 kg of HB101(pNT203, pSK100). δ and δ'
were separated on a 13% SDS-polyacrylamide gel whereupon δ'
resolved into two bands as noted previously (18). The slower and
faster migrating δ' bands will be referred to in this report as δ'L
(large) and δ'S (small), respectively; δ'S was approximately two times
the abundance of δ'L by amino acid analysis. δ, δ'L, and δ'S were
electroblotted onto polyvinylidene difluoride membrane (Whatman)
as described (23) for amino-terminal sequencing (50 pmol each) and
onto nitrocellulose membrane (Schleicher & Schuell) as described
(24) for tryptic analysis (140 pmol of δ, 90 pmol of δ'L, 180 pmol
of δ'S). Proteins were visualized by Ponceau S stain (Sigma), and
their sequences were determined by William S. Lane of Harvard
Microchemistry. Sequences of δ were: δ-amino terminus, NH2-
MLRLYPEQLRAQLNEGLRAAYLLLGNDP; tryptic peptides: δ-1,
NH2-AAYLLLGNDPLLLQESQDAVR; δ-2, NH2-AQENAAWFTA[illegible];
δ-3, NH2-VEQAVNDAAHFTPFHWVDALLM(G)(K). Sequences
of δ'S were: δ'-amino terminus, NH2-MRWYPPL(R)(P)DFEKLVA;
tryptic peptides: δ'-1, NH2-EVTEKLNEHAR; δ'-3,
NH2-VVWVTDAALLTDAAANALLK; δ'-4, NH2-TLEEPPAETWFFLATREP(E)(R)LLAT(L);
δ'-5, NH2-LHYLAPP(P)EQYAVT(W)LSR;
δ'-6, NH2-LSAGSPGAALALFQGDNWQAR. Sequences of
tryptic peptides of δ'L were: δ'-2, NH2-LGGAK; δ'-7 (same as δ'-3),
NH2-VVWVTDAALLTDAAANALLK. Parentheses indicate uncertain
assignments.

Identification of holB 

Two synthetic oligonucleotide probes (DNA oligonucleotides, Oligos
Etc. Inc) were designed from the sequence of two of the tryptic
peptides and the codon usage of E. coli (31) with allowance for a T-
G mispair at the wobble position. A synthetic DNA 57-mer based on
δ'-4 was 5'-ACTCTGGAAGAACCGCCGG(3TGAAACTTCGTT^
TTCTGG(3TACTCGTGAACCJGGAA-3' (after identification and sequencing
holB this probe was incorrect at 11 positions). A DNA 54-mer
based on δ'-6 was 5'-GCTGGTT(yrcCGGGTG(mKnCTG.
GCITCTGTTTCAGGGTGATAACTGGCA(lG(rr-3' (the sequence
of holB showed this probe was incorrect at 9 positions). These probes
(100 pmol each) were 5' end-labeled with 1 μM [γ-32P]ATP
(radionucleotides, Du Pont-New England Nuclear) and polynucleotide
kinase. E. coli genomic DNA (strain C600) was extracted as described
(26) and restricted with either BamHI, HindIII, EcoRI, EcoRV, [illegible],
KpnI, PstI, or PvuII (DNA modification enzymes, New England
Biolabs), and then each digest was electrophoresed in a 0.[illegible]% native
agarose gel followed by Southern analysis (22) using either the 57- or
54-mer as a probe. Initially we used a hybridization temperature of
42 °C and then washed the filters with 2 X SSC and 0.2% SDS at
53 °C. Autoradiography showed a single band in each lane for the 57-
mer. The 54-mer showed two bands in each lane, but one band always
matched the position of the band probed with the 57-mer.

DNA Sequencing 

For the holA gene, the 3.2-kb KpnI/BglII fragment containing holA 
was excised from λ15D7 (169) and directionally ligated into pUC18 
to yield pUC-δ. For holB, the 2.1-kb KpnI/EcoRV fragment containing 
the holB gene was excised from λE9G1 (236) and directionally 
ligated into pUC18 (KpnI/HincII) to yield pUC-δ'. Both strands of 
holA and holB were sequenced by the chain termination method of 
Sanger et al. (26) using the U. S. Biochemical Corp. Sequenase kit, 
α-35S-dATP, and synthetic DNA 17-mers. 

Overproducing Plasmids 

pET-δ—Approximately 1.7 kb of DNA upstream of holA was 
excised from pUC-δ using KpnI (polylinker site) and BsGO. (13 bp 
upstream of the start codon of holA) followed by blunt end formation 
using Klenow polymerase and recircularization of the plasmid using 
T4 DNA ligase. A 1.6-kb fragment containing holA was then excised 
using EcoRI and XbaI (these sites are in the pUC18 polylinker on 
either side of the holA insert) followed by directional ligation into 
M13mp18 to yield M13-δ. An NdeI site was generated at the start 
codon of holA by the primer-directed mutagenesis technique of Kunkel 
et al. (27) using a DNA 33-mer (5'-GTACAAG(3GAATC^r^ 
TTACCCAGCGAGC5TC-3') containing the NdeI site (underlined) at 
the start codon of holA and using DNA polymerase III holoenzyme 
and SSB to replicate the circle completely without strand displacement 
as described by O'Donnell and Kornberg (28). An NdeI fragment 
(2.1 kb) containing holA was excised from the NdeI-mutated M13-δ 
(M13-δNdeI) and ligated into pET3c, linearized using NdeI, to yield 
pET-δ. The orientation of holA in pET-δ was determined by 
sequencing. 

pET-δ'—A 2.1-kb KpnI/HindIII fragment containing holB was 
excised from pUC-δ' and directionally ligated into M13mp18 to yield 
M13-δ'. An NdeI site was generated at the start codon of holB using 
a DNA 33-mer (5'-GGTGAAGGAGTTGGACATATGAGATGGTATCCA-3') 
containing the NdeI site (underlined) at the start codon of 
holB as described above for holA. An NdeI fragment (1,160 bp) 
containing holB was excised from M13-δ'NdeI and ligated into pET3c 
to yield pET-δ' as described above. 

Replication Assay for 5 

The δ replication assay contained 72 ng of M13mp18 ssDNA (0.03 
pmol as circles) uniquely primed with a DNA 30-mer (9), 980 ng of 
SSB (13.6 pmol as tetramer), 22 ng of β (0.29 pmol as dimer), 200 ng 
of γ (2.1 pmol as dimer), and 65 ng of αε complex (0.35 pmol) in a final 
volume (after the addition of proteins) of 25 μl of replication assay 
buffer (20 mM Tris-HCl (pH 7.5); 8 mM MgCl2; 6 mM DTT; 4% 
glycerol; 40 μg/ml BSA; 0.5 mM ATP; a 60 μM concentration each of 
dCTP, dGTP, and dATP; and 20 μM [α-32P]dTTP). Proteins used in 
the assay were purified as described (17). 1-5 ng of δ (or column 
fraction) was added to the assay on ice, shifted to 37 °C for 5 min, 
and then quenched and quantitated using DE81 paper as described 
(29). When needed, proteins were diluted in buffer A containing 60 
mM NaCl and 50 μg/ml BSA. 

Replication Assay for 5' 

The δ' replication assay contained 108 ng of M13mp18 ssDNA 
(0.05 pmol as circles) primed with a DNA 30-mer (9), 1.5 μg of SSB 
(21 pmol as tetramer), 30 ng of β (0.39 pmol as dimer), 22.5 ng of αε 
complex (0.14 pmol), 20 ng of γ (0.21 pmol as dimer), and 2 ng of δ 
(0.05 pmol as monomer) in a final volume of 25 μl of assay buffer 
lacking dATP and dTTP. 1-5 ng of δ' (or column fraction) was added 
to the assay on ice then shifted to 37 °C for 8 min to allow assembly 
of the processive polymerase. DNA synthesis was initiated upon rapid 
addition of 60 μM dATP and 20 μM [α-32P]dTTP and then quenched after 
20 s and quantitated using DE81 paper as described (29). When 
needed, proteins were diluted in buffer A containing 50 μg/ml BSA. 
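
The mass and molar amounts quoted for the assay components can be cross-checked with a short calculation. The sketch below is illustrative only: the function names are hypothetical, the values are the ones quoted for the δ assay, and the 25 μl volume is taken from the text. It relies on the identity that ng divided by pmol gives kDa directly.

```python
"""Cross-check of the ng / pmol pairs quoted for the delta replication assay."""

# (mass in ng, amount in pmol) as quoted in the text for the delta assay.
DELTA_ASSAY = {
    "M13mp18 ssDNA (circles)": (72.0, 0.03),
    "SSB (tetramer)": (980.0, 13.6),
    "beta (dimer)": (22.0, 0.29),
    "gamma (dimer)": (200.0, 2.1),
}

ASSAY_VOLUME_UL = 25.0  # final assay volume quoted in the text


def implied_mass_kda(mass_ng, amount_pmol):
    """Molar mass implied by a mass/amount pair.

    ng / pmol = (1e-9 g) / (1e-12 mol) = 1e3 g/mol, i.e. the ratio is in kDa.
    """
    return mass_ng / amount_pmol


def concentration_nm(amount_pmol, volume_ul=ASSAY_VOLUME_UL):
    """Concentration in nM; pmol per microliter is micromolar, so scale by 1000."""
    return amount_pmol / volume_ul * 1000.0


if __name__ == "__main__":
    for name, (ng, pmol) in DELTA_ASSAY.items():
        print(f"{name}: {implied_mass_kda(ng, pmol):.0f} kDa, "
              f"{concentration_nm(pmol):.1f} nM")
```

Under these figures the primed template is present at roughly 1.2 nM, far below the SSB tetramer concentration, which is consistent with the template being fully coated in the assay.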

RESULTS 

Identification of holA—The sequence of the amino-terminal 
28 amino acids of δ and three internal tryptic peptides were 
determined. One of the tryptic peptides (21 amino acids) 
overlapped 10 amino acids of the amino-terminal sequence. A 
search of the translated GenBank (30) revealed an exact 
match to the 21-amino acid tryptic peptide which overlapped 
the amino-terminal sequence. The matching sequence occurred 
just downstream of the rlpB gene at 15.2 min (688 kb 
starting from thrA) of the E. coli chromosome (32). The match 
to the amino-terminal sequence of δ was imperfect because of 
a few errors in the published sequence of this region.³ The 
published sequence information downstream of rlpB accounted 
for approximately 22% of the δ gene and did not 
encode the other two tryptic fragments. 
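
The 10-residue overlap between the amino-terminal sequence and the 21-residue tryptic peptide can be reproduced with a simple suffix/prefix comparison. This is a minimal sketch, not the procedure used in the study: the helper name `merge_by_overlap` and its `min_overlap` parameter are hypothetical, and the two peptide strings are the δ amino-terminal and δ-1 sequences listed under "Microsequencing."

```python
def merge_by_overlap(left, right, min_overlap=3):
    """Join two peptides on the longest suffix-of-left / prefix-of-right match.

    Returns the merged sequence and the length of the overlap.
    """
    max_len = min(len(left), len(right))
    for size in range(max_len, min_overlap - 1, -1):
        if left.endswith(right[:size]):
            return left + right[size:], size
    raise ValueError("no overlap of the required length")


N_TERMINUS = "MLRLYPEQLRAQLNEGLRAAYLLLGNDP"  # 28 residues, amino terminus of delta
PEPTIDE_1 = "AAYLLLGNDPLLLQESQDAVR"         # 21 residues, tryptic peptide delta-1

merged, overlap = merge_by_overlap(N_TERMINUS, PEPTIDE_1)
print(overlap, len(merged), merged)
```

The merged stretch covers the first 39 residues of δ, which is the segment that could be compared against the translated database entry.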

A 3.2-kb KpnI/BglII fragment containing holA was excised 
from Kohara phage λ15D7 (169), cloned into pUC18, and the 
holA gene was sequenced (Fig. 1). The holA sequence encodes 
the correct amino terminus of δ and all of the tryptic peptides 
in the same reading frame (underlined in Fig. 1). Overall, 
holA encodes a 343-amino acid protein (pI = 6.99) of 38,704 
Da, consistent with the mobility of δ in SDS-polyacrylamide 
gels (20). The termination codon of the rlpB gene overlaps 
the initiating ATG of holA. It is not known whether holA is 
in an operon with rlpB or a downstream gene. One nucleotide 
beyond the holA stop codon is a possible ATG initiation codon 
for an open reading frame throughout the rest of the known 



² We cloned and sequenced both genes to which the 54-mer probe 
hybridized. One was holB; the other contained 16 nucleotides in a 
row which exactly matched the 54-mer, but there was no amino acid 
homology to δ' in the sequence surrounding these 5 amino acids. 



³ The previously published sequence (32) encoding the first 54 
nucleotides of holA (i.e. the amino-terminal 18 amino acids of δ) 
differed by 11 nucleotides from the sequence presented here and was 
a particularly difficult stretch of DNA to sequence. 

-35 -10 

CCGAACAGCT GATTCGTAAG CTGCCA AGCA TCCCTGCTGC C CATATT CGT TCCGACGAAG AACAGACGTC 70 



GACCACAACC GATACTCCGG CAACGCCTCC ACGCCTCTCC ACCACGCTCG GTMCTCATG ATT CGG TTG 



139 

"TTT iipB I Tmec iU arg leu (4) 
stop Delta 

TAC CCG GAA CAA CTC CGC GCG CAG CTC AAT GAA GGC CTG CGC GCG GCG TAT CTT TTA CTT 199 

tyr pro glu gin leu arg ala gin leu asn glu gly leu arg ala ala tyr leu jjeu^jeu (24) 

N-tcnninal analysis 

GGT AAC GAT CCT CTG TTA TTG CAG GAA ACC CAG GAC GCT CTT CCT CAG CTA OCT GCG CCA Z^V 
gly asn aap pro leu leu leu gin glu aer gin aap ala val arg gin val ala ala ala (44) 
6-1 peptide 

CAA GGA rrC GAA GAA CAC CAC ACT TTT TCC ATT GAT CCC AAC ACT GAC TCG AAT GCG ATC 319 

gin gly phe glu glu hie hie thr phe eer ile aep pro asn thr aop trp asn ala iXe (54) 

TTT TCG TTA TW: CAC GCT ATC ACT CTG TTT CCC ACT CGA CAA ACG CTA TTG CTG TTG TT^ 379 

phe eer leu eye gin ala met oer leu phe ala oer arg gin thr leu leu leu leu leu (84) 

CCA GAA AAC GGA CCG AAT OCG <k:G ATC AAT GAG CAA CTT CTC ACA CTC ACC GGA CTT CTG 439 

pro glu aen gly pro asn ala ala ile aan glu gin leu leu thr leu thr gly leu leu (lOi) 

CAT GAC GAC CTG CTG TTG ATC, CTC CCC GGT AAT AAA TTA AGC AAA GCG CAA GAA AAT CCC 499 

hie aop aep leu leu leu ile val arg gly aon lya leu aer lye ala gin glu aan ala (124) 

GCC TGG TTT ACT GCG err GCG AAT CGC AGC CTC CAC GTC ACC TGT CAG ACA CCC CAC CAG 559 

ala trp phe thr ala leu ala asn arg oer val gin val thr eye gin thr pro glu gin (144) 

GCT CAG CTT^C^CGC TGG CTT GCT GCG CGC GCA AAA CAG CTC AAC TTA GAA CTG GAT CAC 619 

ala gin leu pro arg trp val ala ala arg ala lye gin leu aan leu glu leu aap aop (164) 

GCG GCA AAT CAG GTC CTC TXK: TAC TGT TAT GAA GGT AAC CTC CTG GCG CTC GCT CAC GCA 679 

ala ala aen gin val leu eye tyr eye tyr glu gly aan leu leu ala leu ala gin ala (184) 

739 
(2(Vl) 



CTC GAG CCT TTA TCG CTG CTC TCG CCA CAC CGC AAA TTG ACA TTA CCG. CCC CTT GAA CAC 
leu glu arg leu eer leu leu trp pro asp gly lya leu thr leu pro arg val ylu yln 

GCC GTC AAT^GAT GCC GCG CAT TTC ACC CCT rTT' CAT TGG CTT GAT GCT TTG TTG ATC GGA 799 
ala val aen aap ala ala hie phe thr p ro phe hia trp val aap ala leu leu met gly (224) 

6-3 peptide oen 

AAA ACT AAC CGC GCA TTG CAT ATT CTT CAC CAA CTC CCT CTG GAA GGC ACC GAA CCG CTT 

lys aer lye arg ala leu hie ile leu gin gin leu arg leu glu gly oer glu pro val (244) 

ATT TTG TTG CGC ACA TTA CAA CGT GAA CTG TTC TTA CTG err AAC CTG AAA CGC CAG TCT 919 
ile leu leu arg thr leu gin arg glu leu leu leu leu val aan leu lye arg gin ser (264) 

979 
(284) 



1039 



GCC CAT ACG CCA CTG CCT GCG TTG TTT CAT AAC CAT CCG CTA TGG CAG AAC CGC CGG CGC 
ala his thr pro leu arg ala leu phe aap lye hie arg val trp gin aen arg arg gly 

ATC ATC GGC GAG GCG TTA AAT CGC TTA ACT CAG ACG CAG TTA CCT CAG GCC CTG CAA CTC 
met met gly glu ala leu aen arg leu oer gin thr gin leu arg gin ala val gin leu (304) 

CTG ACA CGA ACG GAA CTC ACC CTC AAA CAA GAT TAC GCT CAG TCA CTG TGG GCA GAG CTC 
leu thr arg thr glu leu thr leu lya gin aap tyr gly gin ser val trp ala glu leu. 

GAA GCG TTA TCT CTT CTG TTG TCC CAT AAA CCC CTC GCG GAC CTA TTT ATC GAC GCT TCA 1159 
glu gly leu oer leu leu leu eye his lye pro leu ala aep val phe ile aap gly * (343) 

TATGAAATCT TTACAGCCTC TCTTrCGCGC CACCrTTCAT CCGCTGCACT ATCGTCATCT AAAACCCCTT 1229 . 
GGAAGCGTGG CCGAAGTTTr GATTGGTCTC AC 

Fig. 1. Sequence of holA encoding δ. DNA sequence of holA (upper case) and the predicted amino acid sequence of δ (lower case) 
predicts a 343-amino acid protein of 38,704 Da. The amino-terminal and tryptic peptide sequences obtained from the naturally purified δ 
are underlined. The stop codons of rlpB and holA are marked with asterisks. The putative RNA polymerase promoter signals (-35, 
-10) and Shine-Dalgarno (S.D.) sequence are underlined. Numbering of the nucleotide sequence is presented to the right. Numbering of the 
amino acids of δ is shown in parentheses to the right. 

sequence (101 bp), suggesting a possible gene immediately 
following holA. The nearest possible initiation signals for 
transcription and translation of the holA gene are underlined 
in Fig. 1; the match to their consensus sequences is not strong, 
suggesting a low utilization efficiency. Inefficient transcription 
and/or translation may be expected for a gene encoding 
a subunit of the holoenzyme present at only 10-20 copies/cell 
(33). The holA gene uses several rare codons (CCC (Pro), 
ACA (Thr), GGA (Gly), AGT (Ser), AAT (Asn), TTA, TTG, 
CTC (all Leu)) two to five times more frequently than average 
(31), which may decrease translation efficiency. 

The amino-terminal region of δ (amino acids 17-38) contains 
a possible leucine zipper (LeuX₆LeuX₆LeuX₆Val), although 
a proline between the second and third leucines may 
disrupt the required α-helical structure. Of the 33 arginine 
and lysine residues in δ, 16 (50%) are within amino acids 225- 
307. This same region contains only 5 (14%) of the 35 glutamic 
and aspartic acid residues. Whether this concentration of 
basic residues is significant to function is unknown. There 
are no zinc finger or helix-turn-helix motifs. 

ATP binding to δ has been detected previously by a UV 
cross-linking study of the holoenzyme (34). The sequence of 

Fig. 2. Position of holB on the Kohara map. Section of the Kohara restriction map in the vicinity of 24 min. Restriction enzymes are 
indicated to the right. Shaded fragments indicate those detected by Southern analysis. The position of overlap places holB at 24.3 min. The 
direction of holB is based on the relative positions of the KpnI site to the EcoRV site in the Kohara map and the known direction of holB 
within this fragment by sequence analysis. The correspondence between the observed fragment sizes with those on the map (observed versus 
map) were (in kb): BamHI, >25 versus 38; HindIII, >20 versus 30; EcoRI, >15 versus 16.2; EcoRV, 7.0 versus 6.8; BglI, 4.2 versus 4.2; KpnI, 
6.6 versus 6.4; PstI, 1.7 versus 1.9; PvuII, 6.2 versus 6.2. 



AAGAATCTTT CGATTTCTTT AATCGCACCC GCGCCCGCTA TCTOGAACTG GCAGCACAAG ATAAAAGCAT 70 
TCATACCATT GATGCCACCC ACCCCCTOGA GOCCGTCATC GATCtCAATCC GCACTACCGT GACCCACTGG 140 



TOGACGCATG AGA TOG TAT CXTA TGG TTA CGA CCT GAT TTC CAA AAA CTG GTA 
met arg trp tyr pro trp leu arg pro asp phe glu lys leu val 
I ^ Delta prime N-tennlnat analysis 

GGC AGC TAT CAG CCC OGA ASA OCT CAC CAT GCG CTA CTC ATT CAG GCC TTA CCG GGC ATG 
a la. ser tyr gin ala gly arg gly his his ala leu leu lie gin ala leu pro gly met 



GTGA AGQAG T 

S,D. 



GGC GAT GAT GCT TTA ATC TAG GCX: CTC AOC CGT TAT TTA CTC TGC CAA CAA CCC CAC GGC 
gly asp asp ala leu ile tyr ala leu set arg tyr leu leu cys gin gin pro gin gly 

CAC AAA ACT TGC GGT CAC TGT CCT OGA TCT CAC TTC ATC CAC GCT QGC ACG CAT CCC GAT 
his lys ser cys gly his cys arg gly cys gin leu met gin ala gly thr his pro asp 

TAC TAC ACC CTG GCT CCC CAA AAA OGA AAA AAT AOC CTC OGC GTT GAT GCG GTA CGT GAG 
tyr tyr thr leu ala pro glu lys gly lys asn thr leu gly val asp ala val arg .glu 

CTC ACC CAA AAG CTC AAT GAG CAC OCA CCC TTA GGT OCT CCG AAA GTC GTT TGG GTA ACC 
val thr glu lys leU~Ssn glu hie ala arg.Jeu gly gly ala lys.^val val trp val thr 

8'-1 peptide 6-2peptlde 

GAT GCT GCC TTA CTA ACC GAC GCC GCG GCT AAC CCA TTC CTC AAA ACG CTT GAA GAG CCA 
asp ala ala leu leu thr asp ala ala ala asn ala leu leu lys., thr leu glu glu pro 



8^*3 peptide" 

5 TTT TTC CTC 



CCA OCA GAA ACT TGG 
pro ala glu thr trp phe ph< 



CTC OCT ACC CGC GAG CCT GAA OCT TTA CTC CCA ACA TTA 
le leu ala thr a r^ lu pro glu arg leu leu ala thr. leu 



peptide 

CGT ACT CGT TGT COG TTA CAT TAC CTT GCG CCG CCG CCG GAA CAG TAC GCC GTC ACC TGG 
arg ser arg cys arg , leu his tyr leu ala pro pro pro glu gin tyr ala val thr trp 

5'-^ peptide 

CTT TCA CGC GAA GTC ACA ATC TCA CAG GAT OCA TTA CTT OCC CCA TTC CGC TTA AGC GCC 
leu ser arg, glu val thr met ser gin asp ala leu leu ala ala leu arg ^leu ser ala 

GGT TCG CCT GGC GCG GCA CTC GCG TTC TTT CAG OGA GAT AAC TOG CAC GCT CGT GAA ACA 
gly ser pro gly ala ala leu ala leu phe gin gly asp asn trp gin ala arg. glu thr 

^ 5'^ peptide 

TTC TCT CAG GCG TTC CCA TAT AGC CTG CCA TCG OCC GAC TGG TAT TCG CTC CTA GCG CCC 
leu cys gin ala leu ala tyr ser val pro ser gly asp trp tyr ser leu leu ala ala 

CTT AAT CAT GAA CAA GCT CCG GCG CGT TTA CAC TCX5 CTC OCA ACG TTC CTC ATC GAT GCG 
leu asn his glu gin ala pro ala arg leu his trp leii ala thr leu leu met asp ala 



202 
(15) 

262 
(35) 

322 
(55) 

382 
(75) 

442 
(95) 

502 
(115) 

562 
(135) 

622 
(155) 

682 
(175) 

742 
(195) 

802 
r215) 

862 
(235) 

922 
(255) 



CTA AAA CGC CAT CAT OCT OCT GCG CAC CTC ACC AAT CTT GAT CTC CCG OGC CTC GTC GCC 982 

leu lys arg his his gly ala ala gin val thr asn val asp val pro gly leu val ala (275) 

CAA CTC CCA AAC CAT CTT TCT CCC TCC CGC CTC CAG CCT ATA CTC GOG GAT CTT TGC CAC 1042 

glu leu ala asn his leu ser pro ser arg leu gin ala ile leu gly asp val cys his (295) 

ATT COT GAA CAG. TTA ATC TCT GTT ACA OGC ATC AAC CGC GAG CTT CTC ATC ACC GAT CTT 1102 

ile arg glu gin leu met ser val thr gly ile asn arg g^u leu leu lie thr asp leu (315) 



TTA CTC CGT ATT GAG CAT TAC CTC CAA CCG GGC CTT CTC CTA CCG CTT CCT CAT CTT TAA 
leu leu arg ile glu his tyr leu gin pro gly val val leu pro val pro his leu * 



1162 
(334) 



GAGAGACATC ATOrmTAG TCGACTCACA CTGCCATCTC GATOGTCTOG ATTATCAATC TTTCCATAAG 1232 
GACGTOGATC ACGTTCTOGC GAAAGCCGCC CCACGCGATC TCAAATTTTC TCTOGCAGTC GCCACAACAT 1302 

Fig. 3. Sequence of holB encoding δ'. DNA sequence of holB (upper case) and the predicted amino acid sequence of δ' (lower case) 
predicts a 334-amino acid protein of 36,937 Da. The amino-terminal and tryptic peptide sequences obtained from the naturally purified δ' 
are underlined. The stop codon is marked with an asterisk. The putative translational signal (Shine-Dalgarno (S.D.)) is underlined. Numbering 
of the nucleotide sequence and amino acid sequence (in parentheses) is presented to the right. 

Fig. 4. The δ and δ' expression vectors. Construction of pET-δ 
and pET-δ' is described under "Experimental Procedures." The 
holA insert in pET-δ and holB insert in pET-δ' are shown above the 
pET3c expression vector. The initiating ATG of these genes is positioned 
downstream of the Shine-Dalgarno sequence (S.D.) and a T7 
promoter. Downstream of holA are 492 bp of E. coli DNA and 591 bp 
of M13mp18 DNA. The holB insert contains 158 bp of E. coli DNA 
downstream of holB to an NdeI site. The T7 RNA polymerase 
termination sequence is downstream of the insert. 

Table I 

Purification of δ 

BL21(DE3) pET-δ cells were grown at 37 °C in 12 liters of LB 
containing 1.2 g of ampicillin. Upon growth to an A600 of 1.5 the 
temperature was lowered to 25 °C, and IPTG was added to 0.4 mM. 
After a further 3 h the cells (50 g) were collected by centrifugation. 
Cells were lysed using lysozyme as described (42), and the debris was 
removed by centrifugation. The purification steps that follow were 
performed at 4 °C. The assay for δ is described under "Experimental 
Procedures." The clarified cell lysate (Fraction I, 300 ml) was diluted 
2-fold with buffer A to a conductivity equal to 112 mM NaCl then 
loaded onto a 60-ml hexylamine-Sepharose column (Pharmacia LKB 
Biotechnology Inc.) equilibrated with buffer A + 0.1 M NaCl. The 
hexylamine column was washed with 60 ml of buffer A + 0.1 M NaCl 
then eluted over a period of 14 h using a 600-ml linear gradient of 
0.1-0.5 M NaCl in buffer A. Eighty fractions were collected. Fractions 
16-34 (Fraction II, 125 ml) were dialyzed against 2 liters of buffer A 
+ 90 mM NaCl overnight then diluted 2-fold with buffer A to yield a 
conductivity equal to 65 mM NaCl just prior to loading onto a 60-ml 
column of heparin-Sepharose (Pharmacia) equilibrated in buffer A + 
50 mM NaCl. The heparin-Sepharose column was washed with 120 
ml of buffer A + 50 mM NaCl and then eluted over a period of 14 h 
using a 600-ml linear gradient of 0.05-0.5 M NaCl in buffer A. Eighty 
fractions were collected. Fractions 24-34 (Fraction III) were pooled 
and diluted 3-fold (final volume, 250 ml) with buffer A to a conductivity 
equal to $5 mM NaCl just prior to loading onto a 50-ml Hi Load 
26/10 Q-Sepharose fast flow fast protein liquid chromatography column 
(Pharmacia). The Q-Sepharose column was washed with 150 ml 
of buffer A + 50 mM NaCl and then eluted over a period of 10 h using 
a 600-ml linear gradient of 0.05-0.5 M NaCl in buffer A. Eighty 
fractions were collected. Fractions 28-^6 were pooled (Fraction IV, 
74 ml, 1.9 mg/ml) and passed over a 1-ml ATP-Sepharose column 
(Pharmacia, type II (N-6 linked)) to remove any possible γ complex 
contaminant and then dialyzed versus two changes of 2 liters each of 
buffer A containing 0.1 M NaCl (the DTT was omitted for the purpose 
of determining protein concentration spectrophotometrically) before 
storing at -70 °C. Protein concentration was determined by the 
Bradford method (43) using BSA as a standard except at the last step 
in which the concentration was determined by absorbance using 
ε280 = 46,137 M⁻¹ cm⁻¹. 



-SSB 



Fig. 5. Purification of δ. Panel A, Coomassie Blue-stained 13% 
SDS-polyacrylamide gel of column pools. First lane, total cells, uninduced. 
Second lane, total cells, induced. Third lane, cell lysate 
supernatant (70 μg). Fourth lane, hexylamine-Sepharose pool (15 μg). 
Fifth lane, heparin-agarose pool (7 μg). Sixth lane, Q-Sepharose pool 
(7 μg). Positions of weight markers are to the left. Panel B, comparison 
of the mobility of cloned δ (first three lanes, 4, 2, and 1 μg, respectively) 
with δ in polIII* (fourth lane, 4 μg). Subunits of polIII* and 
the SSB which copurified with it are identified to the right. 

holA shows a near noite^tQ pie^^^ site consensus 

motif {Le. AXGKS for y^resid^^^^ with 
the consensus Be(VX^hceB Wm or G/ 

^t this Bite remains^toM^mned.^^^^^^ 

IdentificationofholB^^^^ 
a 13% SDS-polyaciylanu^ 



(18). The slower 
half the abundance^fm 
S'j^ and «'8 are prob^««? 

(data not shown). I^^^^v 
J't, which had the ""^"SiiCMK 
had identical amino <^'^^ 



Fraction             Total protein   Total unitsᵃ   Specific activity   Fold purification   Yield 
                     mg                             units/mg                                % 
I. Lysateᵇ,ᶜ         2,070           5.4 × 10⁹      2.6 × 10⁶           1.0                 100 
II. Hexylamine       446             2.5 × 10⁹      5.6 × 10⁶           2.2                 46 
III. Heparin         197             2.0 × 10⁹      10.2 × 10⁶          3.9                 37 
IV. Q-Sepharoseᵈ     141             1.5 × 10⁹      10.6 × 10⁶          4.1                 28 



y apx>earsin 
|cl<>8ely spaced bands 
dl^ISP?**^™^ one- 
^'^"'^^-Vg's).Both 
as their 
Tiaimilar 
^'s and 
^^^■^ysiaalso 
jttota 6's and 



ᵃ One unit is defined as 1 pmol of nucleotide incorporated per min. 

ᵇ Lysate of BL21(DE3) cells harboring the pET3c vector yielded a 
specific activity of 10* units/mg. 

ᶜ Omission of γ from the assay of the lysate resulted in a 200-fold 
reduction of specific activity (1.2 × 10⁴ units/mg). 

ᵈ Omission of γ from the assay using pure δ gave no detectable 
synthesis. 
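
The specific activity, fold purification, and yield columns of Table I follow from the protein and unit totals, so the rows can be checked against one another. The sketch below is an illustration under a stated assumption: the 10⁹ exponent on the unit totals is not legible in the scan and is assumed here; the yields and fold purifications do not depend on it. The function name is hypothetical.

```python
# Table I rows: (fraction, total protein in mg, total units).
# ASSUMPTION: the 1e9 exponent is inferred; the scanned exponents are illegible.
TABLE_I = [
    ("I. Lysate", 2070.0, 5.4e9),
    ("II. Hexylamine", 446.0, 2.5e9),
    ("III. Heparin", 197.0, 2.0e9),
    ("IV. Q-Sepharose", 141.0, 1.5e9),
]


def purification_summary(rows):
    """Derive specific activity, fold purification, and yield for each fraction."""
    first_units = rows[0][2]
    first_specific = rows[0][2] / rows[0][1]
    summary = []
    for name, protein_mg, units in rows:
        specific = units / protein_mg  # units per mg of protein
        summary.append({
            "fraction": name,
            "specific_activity": specific,
            "fold": specific / first_specific,          # relative to the lysate
            "yield_percent": 100.0 * units / first_units,
        })
    return summary


for row in purification_summary(TABLE_I):
    print(f"{row['fraction']:<16} {row['specific_activity']:.2e} units/mg  "
          f"{row['fold']:.1f}-fold  {row['yield_percent']:.0f}%")
```

Small differences from the printed fold-purification values are expected, since the table appears to have been computed from already rounded specific activities.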

δ'-3 from δ'S were identical). The amino terminus and five 
tryptic peptides of δ'S and two tryptic peptides of δ'L were 
sequenced. A search of the translated GenBank (30) revealed 
no match to any of the δ' sequences. 

The holB gene was identified using two oligonucleotide 
probes in a Southern analysis of E. coli genomic DNA digested 
with the eight Kohara restriction map enzymes. Imposing the 
restraint that the eight fragments from the Southern analysis 
must overlap at the holB gene, the Kohara map of the E. coli 
chromosome (36) was searched, and only one position of 
overlap of fragments of the observed size was present on the 
map. This position was located at 24.3 min (1,174 kb starting 
from thrA) (Fig. 2). A restriction fragment containing the 
overlapping region was subcloned from λE9G1 (236) and 
sequenced (Fig. 3). The open reading frame encodes the 
amino-terminal sequence and all six tryptic peptides obtained 
from δ'L and δ'S (underlined in Fig. 3). 

The holB gene encodes a 334-amino acid protein of 36,937 

Fig. 6. Replication activity of δ and δ'. Panel A, the pure δ 
subunit was titrated into the replication assay which is described 
under "Experimental Procedures." Panel B, the pure δ' subunit was 
titrated into the replication assay which is described under 
"Experimental Procedures." 




Table II 
Purification of δ' 
300 liters of BL21(DE3) plysS pET-δ' cells were grown at 37 °C 
in LB supplemented with 5 mg/ml glucose, 10 μg/ml thiamine, 50 μg/ml 
thymine containing 100 μg/ml ampicillin, and 25 μg/ml chloramphenicol. 
Upon growth to an A600 of 0.5, IPTG was added to 0.2 mM. 
After further growth for 2 h the cells (940 g) were collected by 
centrifugation, resuspended in an equal weight of 50 mM Tris-HCl 
(pH 7.5), 10% sucrose, and stored at -70 °C. 100 g of cells (30 liters 
of cell culture) were thawed whereupon they lysed (because of lysozyme 
produced by plysS), and cell debris was removed as described 
(42) to yield the cell lysate (Fraction I, 4.41 g in 325 ml). The 
purification steps that follow were performed at 4 °C. The assay for 
δ' is described under "Experimental Procedures." Ammonium sulfate 
(0.21 g/ml) was added to Fraction I and stirred for 90 min. The pellet 
contained δ' (Fraction II, 1.68 g) and was redissolved in 660 ml of 
buffer B and dialyzed against two changes of 2 liters each of buffer B 
to a conductivity equal to 40 mM NaCl. Fraction II was loaded 
onto a 300-ml heparin-agarose column (Bio-Rad) equilibrated with 
buffer B. The heparin column was washed with 450 ml of buffer B + 
20 mM NaCl and then eluted over a period of 14 h using a 2.6-liter 
linear gradient of 20-300 mM NaCl in buffer B. One hundred fractions 
were collected. Fractions 36^ were pooled (Fraction III, 650 ml) 
and dialyzed twice against 2 liters of buffer A to a conductivity equal 
to 60 mM NaCl. Fraction III was loaded onto a 100-ml Q-Sepharose 
column (Pharmacia) equilibrated with buffer A. The Q-Sepharose 
column was washed with 150 ml of buffer A + 20 mM NaCl 
and then eluted over a period of 12 h using a 1.2-liter linear gradient 
of 20-300 mM NaCl in buffer A. Eighty fractions were collected. 
Fractions 34-56 were pooled (Fraction IV, 370 ml) and dialyzed twice 
against 2 liters each of buffer A to a conductivity equal to 6(^ mM 
NaCl just prior to loading onto a 60-ml EAH-Sepharose column 
(Pharmacia's replacement for hexylamine-Sepharose) equilibrated 
with buffer A. The EAH-Sepharose column was washed with YVfr" 
of buffer A + 40 mM NaCl and then eluted over a period of 10 h with 
a 720-ml linear gradient of 40^ mM NaCl in buffer A. Eighty 
fractions were collected. Fractions 1&-30 (Fraction V, 130 ml), which 
contained homogeneous δ', were pooled and dialyzed against 2 liters 
of buffer A (lacking DTT to allow an absorbance measurement, see 
below) to a conductivity of 40 mM NaCl. Fraction V was passed 
over a 6-ml ATP-agarose column (Pharmacia, type II, N-6 linked) to 
remove any γ complex contaminant followed by the addition of DTT 
to 2 mM and then was aliquoted and stored at -70 °C. Protein 
concentration was determined by the method of Bradford (43) using 
BSA as a standard except at the last step in which concentration was 
determined by absorbance. 



Fraction; Total protein; Total unitsᵃ; Specific activity; Fold purification; Yield 

FlO 7. Purification ofV. Panel A. Coomassie Blue-stained 13% 
SDS-Polyacrylamide gel of column pools. First lane, molecular mass 
t^& fecorui torS total cells, uninduced Third to., tg 
induced. Fourth ione. cell lysate siyematant (41 /^)'^\^' 
ammonium sulfate pool (8 ^g). SudhJ^,^ b^^^'^^^^T^Lh 
Mg). Seventh lane. Q-Sepharose pool (6 pg). ^f^^J^^^f^^l 
arose pool (6 ^g). Molecular mass of the markers is to the ieA. and 
So^ofV^anda'aareindicatedtoth^ 

if the mobiUty of cloned 6' (second through fourth lanes, 0 5. 1.5. and 
4 5^.Tespe^vely) with a' within polHT (^t ion^. 4 Mg). Subumte 
oiATtaa off) and SSB which copurified with it are identified 
onK/t of the gel and «'l and aVare identi^^ 

Da (pI = 7.04), consistent with the mobility of δ' in an SDS- 
polyacrylamide gel (18, 20). Upstream of holB is a putative 
Shine-Dalgarno sequence (Fig. 3). It is not known whether 
holB is in an operon. The possibility of a gene directly upstream 
of holB is indicated by an open reading frame (+1 
relative to holB) throughout the 158-bp upstream sequence 
which terminates with a TGA stop codon that overlaps the 
initiating ATG of holB. In addition, there is an ATG 10 bp 
downstream of holB without an in frame stop codon over the 


Fraction                Total protein   Total unitsᵃ   Specific activity   Fold purification   Yield 
                        mg                             units/mg                                % 
I. Lysateᵇ,ᶜ            4,414           3.0 × 10¹⁰     7 × 10⁶             1.0                 100 
II. Ammonium sulfate    1,684           2.5 × 10¹⁰     16 × 10⁶            2.3                 83 
III. Heparin            990             2.6 × 10¹⁰     26 × 10⁶            3.7                 87 
IV. Q-Sepharose         781             2.6 × 10¹⁰     33 × 10⁶            4.7                 87 
V. EAH-Sepharoseᵈ       732             2.5 × 10¹⁰     34 × 10⁶            4.9                 83 



ᵃ One unit is defined as 1 pmol nucleotide incorporated in 20 s. 

ᵇ Lysate of BL21(DE3) plysS cells harboring the pET3c vector 
yielded a specific activity of 1,252 units/mg. 

ᶜ Omission of γ and δ from the assay of the lysate resulted in a 
7,650-fold reduction of specific activity (915 units/mg). 

ᵈ Using pure δ', omission of γ from the assay gave no detectable 
synthesis under the conditions of the assay. 

remaining 130 bp of sequence, suggesting a possible downstream 
gene. There is no obvious promoter for holB, consistent 
with the possibility that the holB gene is in an operon. 
Alternatively, the promoter may poorly match the consensus, 
as a low level of transcription may be expected for a subunit 
of a holoenzyme which is present at only low levels. The holB 
gene uses several rare codons (TTA (Leu), ACA (Thr), GGA 
(Gly), AGC, TCG (Ser)) two to four times more frequently 
than average (31), which may decrease translation efficiency. 
The holB gene contains a helix-turn-helix motif (Ala/ 

leuS  rlpB  holA  orf1  orf2  mrdA  mrdB  rlpA 

Fig. 8. Genes in the vicinity of holA on the E. coli chromosome. 
The holA gene is located between the rlpB and mrdA genes at 
15.2 min on the E. coli chromosome. leuS, leucyl-tRNA synthase; 
rlpB, rare lipoprotein B; holA, δ subunit of DNA polymerase III 
holoenzyme; orf1, open reading frame for a 7.7-kDa protein (32); orf2, 
open reading frame for a 17.7-kDa protein (32); mrdA (also pbpA), 
penicillin-binding protein 2; mrdB (also rodA), rodA protein; rlpA, 
rare lipoprotein A. 



tion of γ, the δ' stimulates the assay 30-fold (Fig. 6B). The 
ε280 value of δ' calculated from the holB sequence is 59,600 
M⁻¹ cm⁻¹; the absorbance of δ' is only 0.9% lower in the presence of 6 M 
guanidine hydrochloride than in its native state, for a native ε280 value of 60,136 M⁻¹ cm⁻¹.⁴ 

GlyX₃GlyX₅Ile/Val) at Ala₈₀Gly₈₄Val₉₀, although ability of δ' 
to bind DNA has yet to be examined. There is also a possible 
leucine zipper (Leu₇X₆Leu₁₄X₆Gly₂₁X₆Leu₂₈) in the amino terminus, 
although Gly interrupts the Leu pattern. The holB 
sequence does not contain an ATP binding site motif or a 
zinc finger. 
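
The heptad spacing behind the possible leucine zippers can be inspected directly on the amino-terminal sequences. The sketch below is illustrative: the helper name is hypothetical, the δ' string is transcribed from the first 28 residues of the Fig. 3 translation, and the δ string is the amino-terminal sequence extended by tryptic peptide δ-1. Both transcriptions are assumptions to the extent that the scanned figures are legible.

```python
# Amino-terminal stretches transcribed from the text and from Fig. 3 (assumed readings).
DELTA_N_TERMINUS = "MLRLYPEQLRAQLNEGLRAAYLLLGNDPLLLQESQDAVR"
DELTA_PRIME_N_TERMINUS = "MRWYPWLRPDFEKLVASYQAGRGHHALL"


def heptad_residues(sequence, start, repeats=4, spacing=7):
    """Return (position, residue) pairs at 1-based positions start, start+7, ..."""
    positions = [start + spacing * i for i in range(repeats)]
    return [(pos, sequence[pos - 1]) for pos in positions if pos <= len(sequence)]


# delta': Leu7, Leu14, Gly21, Leu28 (Gly interrupts the Leu pattern)
print(heptad_residues(DELTA_PRIME_N_TERMINUS, 7))
# delta: Leu17, Leu24, Leu31, Val38, with Pro28 between the 2nd and 3rd leucines
print(heptad_residues(DELTA_N_TERMINUS, 17), DELTA_N_TERMINUS[27])
```

A positional check of this kind only confirms the spacing; it says nothing about whether the segment actually forms an α-helix.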

Purification of δ—The holA gene was cloned into M13mp18 
followed by site-directed mutagenesis to create an NdeI site 
at the initiating methionine. The holA gene was then subcloned 
into the NdeI site of the pET3c expression vector (21), 
which places holA under control of a strong T7 RNA polymerase 
promoter (Fig. 4). pET-δ was transformed into 
BL21(DE3) cells, which harbor a λ lysogen containing the T7 
RNA polymerase gene controlled by the lacUV5 promoter 
(21). Upon induction of T7 RNA polymerase with IPTG, δ 
was expressed to 27% total cell protein (compare Fig. 5A, first 
lane (uninduced) and second lane (induced)). We purified 141 
mg of δ (Table I) starting from 12 liters of induced cells 
(soluble fraction, Fig. 5A, third lane) upon column fractionation 
using hexylamine-Sepharose (Fig. 5A, fourth lane), 
heparin-agarose (fifth lane), and Q-Sepharose (sixth lane). δ 
tended to precipitate upon standing in low salt (<70 mM), 
especially during dialysis. Therefore, low salt was avoided 
except for short periods of time, and column fractions containing 
δ were sometimes diluted for the next column rather 
than dialyzed overnight. Cloned δ comigrated with δ within 
polIII* (holoenzyme lacking β) (Fig. 5B). The δ subunit was 
assayed as described in an earlier study (18), which demonstrated 
its requirement to reconstitute the rapid and processive 
polymerase from the β and γ subunits and αε polymerase 
(Fig. 6A). The ε280 value calculated from the holA sequence is 
46,230 M⁻¹ cm⁻¹.⁴ The measured absorbance of δ in 6 M 
guanidine hydrochloride is only 0.2% higher than in its native 
state. Hence, the ε280 of native δ is 46,137 M⁻¹ cm⁻¹. 

Purification of δ'—holB was cloned into pET3c (Fig. 4) and 
transformed into BL21(DE3) plysS. Upon induction of resident 
T7 RNA polymerase with IPTG, δ' was expressed to 
50% of total cell protein (compare Fig. 7A, second lane (uninduced 
cells) and third lane (induced cells)). Upon cell lysis, 
the majority of δ' was soluble (Fig. 7A, fourth lane). We 
purified over 700 mg of δ' (Table II) starting from 30 liters of 
cell culture. The cell lysate was fractionated first by ammonium 
sulfate precipitation (fifth lane) and then by successive 
column fractionation steps using heparin-agarose (sixth lane), 
Q-Sepharose (seventh lane), and EAH-Sepharose (eighth 
lane). The δ' subunit was assayed by its ability to stimulate 
reconstitution of the highly processive polymerase with the 
γ, δ, and β subunits and αε polymerase (12). In the absence 
of δ', this assay requires a large excess of γ subunit (described 
in the accompanying report, Ref. 19). Using a low concentra- 



The overproduced δ′ consists of a doublet just as observed 
previously for naturally purified δ′ (18). Fig. 7B shows that 
the two polypeptides of cloned δ′ comigrate with those of δ′ 
in pol III*. As with naturally purified δ′, the abundance of δ′L 
is approximately 50% the abundance of δ′S. A trivial 
explanation for the two polypeptides is that δ′S is a proteolytic 
product of δ′L. However, electrospray mass spectrometry 
showed the major species, δ′S, had a mass of 36,930 Da, which 
is the mass predicted from the entire holB gene sequence, 
indicating that δ′S is not the result of proteolytic degradation.⁵ 
The nature of δ′L is presently under investigation. 



⁴ The ε280 value calculated from the amino acid composition of a 
protein is within 6% of the ε280 value in the native state using the 
equation: ε = Trp_m (5690 M⁻¹ cm⁻¹) + Tyr_n (1280 M⁻¹ cm⁻¹), where 
m and n are the number of Trp and Tyr residues in δ, respectively 
(39). 



DISCUSSION 

Chromosomal Context of holA— It is interesting to note that 
holA is in an area of the chromosome containing several 
membrane protein genes (Fig. 8). They are all transcribed in 
the same direction. The mrdA and mrdB genes encode 
proteins responsible for the rod shape of E. coli, and the rlpA, 
rlpB genes encode rare lipoproteins speculated to be important 
to cell duplication (32). The position of holA within a cluster 
of membrane proteins may be coincidental or may help 
coordinate duplication of the cell and the chromosome. 

The δ′ Doublet— δ′ in the γ complex and the cloned δ′ 
appear as a doublet in a 13% SDS-polyacrylamide gel. 
Electrospray mass spectrometry of the faster migrating species, 
δ′S, showed it was the size predicted for the full-length gene 
product and not derived by proteolysis, suggesting that δ′L, 
which migrates slower than δ′S, may be a modified form of 
increased size. 
Possible modifications include mRNA splicing, use of an 
upstream ATG, read-through of the stop codon, translational 
frame shifting, and covalent modification. Amino-terminal 
sequence analysis of the cloned δ′L and δ′S subunits showed 
them to have identical amino termini, proving that δ′L is not 
derived from an alternate upstream ATG start site (data not 
shown). Treatment with calf intestinal and bacterial alkaline 
phosphatases did not affect the mobility of either δ′S or δ′L, 
suggesting that serine and threonine phosphorylation are not 
involved (not shown), although phosphorylation of other 
residues or other types of covalent modification remains a 
possibility. Translational read-through would produce a protein 
containing 19 additional residues for an increase of 2130 Da. 
A −1 translational frame shift would produce a protein 
containing an additional 7 amino acids before encountering a 
stop codon in the −1 frame. We detect no obvious frame 
shifting signals in holB (i.e. a stretch of A residues followed 
by a hairpin, or an XXXYYYZ site). 
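
As a rough plausibility check on the read-through figure, the quoted mass increase can be related to the number of added residues. The sketch below only divides the two numbers given in the text. The 110 Da average residue mass is a common rule of thumb and an assumption here, not a value from this report, and the function names are illustrative.

```python
AVERAGE_RESIDUE_MASS_DA = 110.0  # rule-of-thumb assumption, not from the text


def implied_residue_mass(extra_mass_da: float, extra_residues: int) -> float:
    """Average mass per added residue implied by a total mass increase."""
    return extra_mass_da / extra_residues


def rough_mass_increase(extra_residues: int,
                        residue_mass_da: float = AVERAGE_RESIDUE_MASS_DA) -> float:
    """Rule-of-thumb mass added by a C-terminal extension."""
    return extra_residues * residue_mass_da


# Read-through case from the text: 19 residues, 2130 Da.
per_residue = implied_residue_mass(2130, 19)
# -1 frameshift case: 7 residues, rough estimate only.
frameshift = rough_mass_increase(7)
print(round(per_residue, 1), frameshift)  # prints: 112.1 770.0
```

About 112 Da per added residue is in line with the rule of thumb. The frameshift estimate is only approximate, since the identities of the 7 added residues are not listed here.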

δ′ Is Homologous to γ and τ— A homology search of the 
translated GenBank showed that the most homologous 
protein to δ′ was another E. coli protein, the γ/τ subunit(s) of 
DNA polymerase III holoenzyme. There is 27% identity and 
44% similarity including conservative substitutions over the 
entire lengths of δ′ and γ/τ (Fig. 9). One particular region in 
δ′ of 50 amino acids (amino acids 110-159) is strikingly similar 
to γ/τ (amino acids 121-170), having 49% identity. 
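
Identity percentages of this kind are computed column by column over a gapped alignment. The following is a minimal sketch of one common convention (matches divided by the number of columns in which neither sequence has a gap). The function name and the toy sequences are invented for illustration and are not taken from Fig. 9. Other conventions use a different denominator, and the one used for the figures above is not stated.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent of aligned (non-gap) columns in which both residues match."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    identical = 0
    for a, b in zip(aligned_a, aligned_b):
        if a == gap or b == gap:
            continue  # skip columns where either sequence has a gap
        compared += 1
        if a == b:
            identical += 1
    if compared == 0:
        return 0.0
    return 100.0 * identical / compared


# Toy aligned fragments, invented for illustration (not from Fig. 9).
seq_a = "MKT-LLAG"
seq_b = "MRTALL-G"
print(round(percent_identity(seq_a, seq_b), 1))  # prints: 83.3
```

Here 6 columns are compared and 5 match. A similarity score would additionally count conservative substitutions, which requires a substitution table that this sketch does not include.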

The extent of sequence homology between δ′ and γ/τ 
subunits is above the level required to speculate that they 
have similar three-dimensional structures. What function 
does such structural redundancy within one multiprotein ma- 


⁵ Electrospray mass spectrometry of δ′S was performed by Dr. 
William S. Lane of Harvard Microchemistry. The δ′S was separated 
from δ′L by reverse phase HPLC prior to analysis. 
[Fig. 9: alignment of Delta prime (δ′) with Gamma/Tau (γ/τ).] 

Fig. 9. Sequence homology between δ′ and the γ/τ subunits. 
Alignment of the δ′ and γ/τ amino acid sequences. Gaps were 
introduced to maximize homology. Identical amino acids are 
marked between the two sequences, and similar amino acids are 
indicated by the dots. Numbering of the sequence of δ′ is 
presented above the sequence. Only the first 351 amino acids of 
γ/τ are shown (total = 431 for γ and 643 for τ). 

chinery serve? It seems possible that the most similar regions 
of structure among these proteins could serve the same 
function and that the more dissimilar regions may have different 
functions. Hence, the coupling of one process to different 
events may be served by this "protein family" within the 
holoenzyme particle. 

The replicative polymerases of phage T4 and eukaryotes 
also have multiple accessory proteins (1). The accessory 
proteins of the eukaryotic polymerase δ (and ε) are the 5-protein 
RF-C complex (or A-1) and the proliferating cell nuclear 
antigen protein. The accessory proteins of the T4 polymerase 
are the 44/62 complex and the gene 45 protein. 

Recently we have found that the δ′ and γ/τ subunits of the 
E. coli γ complex have a significant level of homology to the 
36-, 37-, 38-, and 40-kDa subunits of the human RF-C complex 
and to the gene 44 protein of the phage T4 44/62 complex 
(38). The homology was especially strong in the 60-amino 
acid region in which δ′ and γ/τ are most homologous. 
Previously we outlined a homology alignment between β (E. coli), 
proliferating cell nuclear antigen (yeast and human), and gene 
45 protein (phage T4) using the structure of β as a guide (10). 
The homology in both function and sequence among accessory 
proteins spanning the range from E. coli to humans suggests 
that the basic mechanism of DNA replication has been 
conserved throughout evolution. 
The holA gene encoding δ 

holA (δ) (38.7 kDa), 1029 bp total 
Takase et al. 
rlpB (19 kDa), 230 bp of holA 
O'Donnell 
(11 of the first 54 bp were incorrect sequence. 
Also, this was unrecognized by Takase as a reading frame) 



